
digitalmars.D.learn - Best way to manage non-memory resources in current D, ex: database

Chad Joan <chadjoan gmail.com> writes:
Hello all,

I intend to write a range that iterates over a non-memory 
resource.

In case you want something more concrete:
I'll be using the Windows ODBC API and calling SQLFetch 
repeatedly on a statement handle allocated with SQLAllocHandle 
and deallocated with SQLFreeHandle.

But in general:
I want to call some function (ex: SQLAllocHandle) at the 
beginning of my range's lifetime and call another function (ex: 
SQLFreeHandle) at the end of my range's lifetime.  Unlike memory 
resources, reference cycles are unlikely (or impossible), and it 
is highly desirable to free the resource as soon as possible 
(deterministic allocation/deallocation).

What's the best way to implement such a range in current D?

It seems things have changed since back when I used D and we used 
the scope attribute on a variable or struct/class declaration to 
at least mostly fake it (with limitations, of course).

I'm seeing these possibilities so far:
1.) Use std.typecons's Unique or RefCounted templates.
2.) Put a deallocate() method on my range struct/class. Call it 
in a scope(exit) statement.
3.) Use a struct for my range. Put a deallocate() method. Call 
deallocate() from ~this().

Please let me know if there are other things in the current D 
ecosystem intended to solve this problem.

On what I've considered, I'll break my thoughts into sections.

==============================================================
=== (1) Use std.typecons's Unique or RefCounted templates. ===
==============================================================

Sorry if this one's long.  Skip it if you don't know how these 
are intended to be used.

------ Unique ------

The documentation states about Unique:
"Encapsulates unique ownership of a resource. Resource of type T 
is deleted at the end of the scope, unless it is transferred. The 
transfer can be explicit, by calling release, or implicit, when 
returning Unique from a function. The resource can be a 
polymorphic class object, in which case Unique behaves 
polymorphically too."
Source: https://dlang.org/phobos/std_typecons.html#.Unique

How does it "delete" the resource T?  What method does it call to 
tell T to deallocate itself?  When I look at the source, it looks 
like Unique calls destroy(_p), where _p is presumably the wrapped 
instance of T.

The documentation from the destroy function says this:
"Destroys the given object and puts it in an invalid state. It's 
used to destroy an object so that any cleanup which its 
destructor or finalizer does is done and so that it no longer 
references any other objects. It does not initiate a GC cycle or 
free any GC memory."
Source: https://dlang.org/library/object/destroy.html

The destroy function doesn't mention what methods specifically 
will be executed on the target object, other than "destructor or 
finalizer".  I don't know what a finalizer is in D and cursory 
internet searching brings up nothing.  As for destructors, the 
docs say that "when the garbage collector calls a destructor for 
an object of a class that has members that are references to 
garbage collected objects, those references may no longer be 
valid."  (https://dlang.org/spec/class.html#destructors)  I 
originally thought that the entire class contents would be 
invalid during destruction, but maybe we are lucky, maybe it is 
JUST references to GC memory that become invalid.

Is the destructor of a class actually an appropriate place to 
deallocate non-memory resources, at least if Unique!T is used, or 
is this going to be fragile?
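
A minimal sketch of the Unique behavior discussed above, under stated assumptions: Handle is a hypothetical class whose constructor and destructor stand in for calls like SQLAllocHandle and SQLFreeHandle, and liveHandles is a hypothetical counter that only exists to make the cleanup observable. It is not a definitive answer on how robust this is in larger programs.

```d
import std.typecons : Unique;

// Hypothetical counter that makes acquisition/release observable.
int liveHandles;

// Hypothetical stand-in for a wrapper around a non-memory resource.
class Handle
{
	this()  { ++liveHandles; }  // a call like SQLAllocHandle would go here
	~this() { --liveHandles; }  // a call like SQLFreeHandle would go here
}

void useHandle()
{
	Unique!Handle h = new Handle;
	assert(liveHandles == 1);
	// When h goes out of scope, Unique destroys the wrapped object,
	// which runs Handle's destructor right away instead of waiting
	// for a garbage collection.
}

void main()
{
	useHandle();
	assert(liveHandles == 0);
}
```

The destructor here touches only a plain integer, never GC-managed memory, which is the situation the class destructor documentation warns about.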

------ RefCounted ------

RefCounted: The documentation says:
"Defines a reference-counted object containing a T value as 
payload. RefCounted keeps track of all references of an object, 
and when the reference count goes down to zero, frees the 
underlying store. RefCounted uses malloc and free for operation.

RefCounted is unsafe and should be used with care. No references 
to the payload should be escaped outside the RefCounted object.

... some stuff about autoInit ... "

So I have similar questions about this as I did Unique: How does 
it "free" the resource T?  What method does it call to tell T to 
deallocate itself?

The documentation suggests that it uses malloc/free, which, if I 
am guessing right, would mean that the underlying T would have 
absolutely no chances to free underlying resources.  Its memory 
would get free'd and that's EOL for the instance, no mulligans.

The source code seems to suggest that it calls destroy(...) like 
Unique does.

Is RefCounted intended to be used with non-memory resources, or 
is it strictly a memory manager using malloc/free and reference 
counts?

I also worry about RefCounted's unconditional dependency on 
mallocator functions (and possibly also the pureGcXxxx 
functions).  Maybe that is a feature request for another day.

==============================================================
=== (2)  Put a deallocate() method; call it in scope(exit) ===
==============================================================

import std.stdio;

struct MyRange
{
	static MyRange allocate() {
		MyRange r;
		writeln("Allocate MyRange.");
		return r;
	}

	void deallocate() { writeln("Deallocate MyRange."); }
}

int foo(bool doThrow)
{
	auto a = MyRange.allocate();
	
	scope(exit) a.deallocate();
	
	if ( doThrow )
		throw new Exception("!");
	else
		return 42;
}

void main()
{
	foo(false);
	foo(true);
}

I'm leaning towards this methodology; it is simple and doesn't 
make a lot of assumptions.  For this project, my usage should 
have pretty straightforward object lifetimes; defining these 
lifetimes with function scopes should be sufficient.

I still wonder if there is a better way, especially if I am not 
fortunate enough to get easy program requirements in the future.

==============================================================
===  (3)  Put a deallocate() method; call it in ~this()    ===
==============================================================

import std.stdio;

struct MyRange
{
	static MyRange allocate() {
		MyRange r;
		writeln("Allocate MyRange.");
		return r;
	}

	~this() { deallocate(); }
	void deallocate() { writeln("Deallocate MyRange."); }
}

int foo(bool doThrow)
{
	auto a = MyRange.allocate();
	if ( doThrow )
		throw new Exception("!");
	else
		return 42;
}

void main()
{
	foo(false);
	foo(true);
}

This seems to work.  Any caveats?

I suppose this will have similar drawbacks to (2): if the 
lifetime of the objects becomes more complicated than function 
scope, then this may not work well enough.

==============================================================

Thanks!
- Chad
Mar 08
Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 8 March 2017 at 23:54:56 UTC, Chad Joan wrote:
 What's the best way to implement such a range in current D?
I'd go with a struct with disabled copying and default construction, then make the destructor free it and the function that returns it populate it. So basically Unique.
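
As a minimal sketch of that suggestion (Statement, openStatement and the openStatements counter are hypothetical names, and the integer handle stands in for a real C handle):

```d
// Hypothetical counter standing in for a real C resource.
int openStatements;

struct Statement
{
	private int handle;   // hypothetical handle; 0 means "none"

	@disable this();      // no default construction
	@disable this(this);  // no copies, so no double release

	private this(int handle)
	{
		this.handle = handle;
		++openStatements;     // acquisition would happen here
	}

	~this()
	{
		// Release exactly once, and never for an empty value.
		if (handle != 0) { --openStatements; handle = 0; }
	}
}

// The only place a Statement gets populated.
Statement openStatement()
{
	return Statement(1);
}

// Copying is rejected at compile time.
static assert(!__traits(compiles, (ref Statement a) { auto b = a; }));

void main()
{
	{
		auto s = openStatement();  // moved out of the function, not copied
		assert(openStatements == 1);
	}
	// s left scope, so its destructor has already run.
	assert(openStatements == 0);
}
```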
 The destroy function doesn't mention what methods specifically 
 will be executed on the target object, other than "destructor 
 or finalizer".
It calls the `~this()` function, aka the destructor. It is just sometimes called a finalizer in other contexts too. Same thing, different name.
 So I have similar questions about this as I did Unique: How 
 does it "free" the resource T?  What method does it call to 
 tell T to deallocate itself?
So it doesn't deallocate itself, but it can deallocate its members. It first calls `~this()`, your destructor, and you can free the members there. Then it calls `free()` on the outer object pointer itself. So clean up the members in the destructor and you should be good. Basically the same deal as with Unique.
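
A minimal sketch of that sequence with RefCounted, assuming a hypothetical Payload struct whose integer id stands in for a C handle and a hypothetical openHandles counter that makes the release visible:

```d
import std.typecons : RefCounted;

// Hypothetical counter standing in for "open C handles".
int openHandles;

struct Payload
{
	int id;  // hypothetical handle value; 0 means "holds nothing"

	this(int id) { this.id = id; ++openHandles; }

	~this()
	{
		// Guard so a default-initialized Payload releases nothing.
		if (id != 0) { --openHandles; id = 0; }
	}
}

void main()
{
	{
		auto a = RefCounted!Payload(7);
		assert(openHandles == 1);
		{
			auto b = a;  // second reference to the same payload
			assert(a.refCountedStore.refCount == 2);
		}
		assert(openHandles == 1);  // one reference is still alive
	}
	// Last reference gone: the payload's destructor has run.
	assert(openHandles == 0);
}
```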
 I'm leaning towards this methodology;
This is good, but it is easy to forget the scope(exit) too. That said, this is how my database.d handles its connection classes. If your connection is a struct, using the destructor is better (your option #3), since it just does this automatically - a struct destructor (unless it is in a dynamic array or some other kind of pointer) is automatically called on scope exit. Classes, though, do not get their dtors called then - they wait until they are GC'd - so scope(exit) does a good job with getting them cleaned up faster.
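
A rough sketch of that difference, assuming hypothetical Connection and Result types and two hypothetical counters in place of real handles:

```d
// Hypothetical counters standing in for real handles.
int openConnections;
int openResults;

class Connection
{
	this()  { ++openConnections; }
	~this() { --openConnections; }
}

struct Result
{
	private bool live;

	this(int unused) { live = true; ++openResults; }

	~this()
	{
		if (live) { --openResults; live = false; }
	}
}

void work()
{
	auto conn = new Connection;
	scope(exit) destroy(conn);  // without this, cleanup waits for the GC

	auto res = Result(0);       // struct: destructor runs at scope exit
	assert(openConnections == 1 && openResults == 1);
}

void main()
{
	work();
	assert(openResults == 0);      // released by the struct destructor
	assert(openConnections == 0);  // released by scope(exit) destroy
}
```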
 ===  (3)  Put a deallocate() method; call it in ~this()
This is what my database.d does with the query results. The database connection class is polymorphic and thus doesn't work well as a struct, but the query result worked beautifully as a struct and the destructor handles it.

I also threw in `@disable this(this);` to ensure it isn't copied somewhere so I don't have to refcount it or anything annoying like that. On the other hand, I must consume the query in-place (or pass it by pointer to other functions being careful not to keep it after the outer function returns)... but that's what I want to do anyway.

So my code looks something like this:

class Database {
	this(string conn) {
		this.handle = establish_connection(conn);
		if(this.handle is null)
			throw new Exception("connection failed");
	}

	~this() {
		close_connection(this.handle);
	}

	Result query(string sql, string[] args) {
		// hugely simplified
		auto q = prepare_query(sql);
		bind_args(q, args);
		// the Result is constructed here and only here
		return Result(execute_query(q));
	}
}

struct Result {
	// no business default constructing it ever
	@disable this();

	// forbid copying it. Instead, pass it by pointer
	// or just loop it in-place and copy the results
	// elsewhere for storage.
	@disable this(this);

	// private constructor since only the query
	// method above should be making these
	private this(c_query_handle handle) {
		this.handle = handle;
		// do whatever other allocation needs
		// to be done via C functions
		// ....
		popFront(); // prime the first result
	}

	~this() {
		// destroy whatever C resources this holds
		destroy_query_handle(this.handle);
	}

	Row front() {
		return makeDFriendly(this.current_row);
	}

	void popFront() {
		this.current_row = fetch_next_row(this.handle);
	}

	bool empty() {
		// empty once no rows remain
		return !has_more_rows(this.handle);
	}
}

Then use it like this:

void main() {
	auto db = new Database("db=test");
	scope(exit) .destroy(db);

	foreach(row; db.query("select * from foo", null)) {
		// work with row
	}
}

Now, if you forget to scope(exit), it is OK, the garbage collector WILL get around to it eventually, and it is legal to work with C handles and functions from a destructor. It is only illegal to call D's garbage collector's functions or to reference memory managed by D's GC inside the destructor. C pointers are fine. It is just nicer to close the connection at a more specified time.
Mar 08
Chad Joan <chadjoan gmail.com> writes:
Awesome, thank you!

On Thursday, 9 March 2017 at 00:47:48 UTC, Adam D. Ruppe wrote:
 Now, if you forget to scope(exit), it is OK, the garbage 
 collector WILL get around to it eventually, and it is legal to 
 work with C handles and functions from a destructor. It is only 
 illegal to call D's garbage collector's functions or to 
 reference memory managed by D's GC inside the destructor. C 
 pointers are fine.
It's good to have this confirmed. I'm always a bit trepidatious around destructors.

Oooh, and it looks like there is more information in the language spec about @disable on struct constructors and postblits now (compared to the previous time I tried to learn about that feature). So between that and your example, I think I have a feel for how to use that.

Thanks. Have a wonderful day!
Mar 08
Kagamin <spam here.lot> writes:
Unique is probably not a good fit for a database connection: you 
can't hold the connection in two variables. Also, if it holds a 
reference to GC-allocated memory, it can't itself be placed in 
GC-allocated memory: when that memory is collected, Unique will 
try to destroy its possibly already freed object, resulting in a 
use after free. RefCounted is ok; it calls into the GC so that 
its payload is scanned for GC references. However, destructors 
during collection are called in a different thread, so the 
counter can be decremented incorrectly. Phobos tends to have 
specialized types like File and RcString, but they have the same 
problem with reference counting during collections.
Mar 09
Kagamin <spam here.lot> writes:
 RefCounted is ok
That is, if the GC methods it calls are legal during collection.
Mar 09