
digitalmars.D.learn - Using Postgres connection functions

reply Joe <jma freedomcircle.com> writes:
I'm trying to learn how to use D to connect (and send queries) to 
Postgres, i.e., libpq in C. Postgres has three families of 
connection functions: PQsetdbLogin, which takes multiple 
individual arguments (all as const char *); PQconnectdb, which 
takes a single connection string (which Postgres then has to 
parse into keywords/values); and PQconnectdbParams (introduced 
with PG 9.0 in 2010), which takes two arrays of char pointers, 
one for the keywords and another for the values. In other words, 
the input to the last one is pre-parsed, which I believe makes it 
more convenient for callers: most likely they get the parameter 
values from command line arguments, environment variables or UI 
dialogs, so it saves the connection string formatting on the 
caller side and the parsing/decomposing on the library side.

In spite of the latter considerations, it appears most D 
libraries for Postgres only support connection string usage (only 
derelict-pq has a declaration for PQconnectdbParams--but I 
haven't tested it).

In any case, I tried to use PQconnectdbParams by first declaring 
it, as per the Postgres manual, with

extern(C)
{
     struct PGconn;

     PGconn *PQconnectdbParams(const char * const * keywords,
                               const char * const * values,
                               int expand_dbname);
}

This caused ldc2, on Linux, to complain as follows:

connection.d(30): Error: found 'const' when expecting ')'
connection.d(30): Error: semicolon expected following function declaration
connection.d(30): Error: declaration expected, not '*'

It only compiled after I removed the second 'const' in the first 
and second arguments.

The hard thing was figuring out how to pass the keyword/values 
arrays, defined as:

string[] keywords = ["hostaddr", "port", "dbname"];
string[] values = ["127.0.0.1", "5432", "testdb"];

to the D invocation of PQconnectdbParams.  I ended up with the 
following, which looks like covering a patient with lots of 
bandaids and looking the other way (and I had to modify the 
arrays above with a fixed length).

     extern(C) char * [keywords.length + 1] kws;
     extern(C) char * [keywords.length + 1] vals;
     foreach (i, k; keywords) {
         kws[i] = cast(char *)toStringz(k);
         vals[i] = cast(char *)toStringz(values[i]);
     }
     kws[keywords.length] = null;  // Postgres wants the NULL termination
     vals[keywords.length] = null;

     conn = PQconnectdbParams(cast(const char **)kws,
                              cast(const char **)vals, 1);

I assume that in a real program I'd have to malloc the C arrays, 
since I wouldn't know beforehand how many parameters there would 
be (or I could define them with a static size of about 30 since 
that's how many connection parameters are recognized currently).

So my question is: is there an easier or better way of passing 
two arrays of C null-terminated strings to an extern(C) function?
Jan 12
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 13 January 2018 at 04:17:02 UTC, Joe wrote:
 It only compiled after I removed the second 'const' in the 
 first and second arguments.
Yeah, D's const applies down the chain automatically, so you don't write it twice there.
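For reference, a declaration along these lines compiles (a sketch; since D's const is transitive, the parameter type const(char*)* already covers C's const char * const *):

```d
extern(C)
{
    struct PGconn;

    // const(char*) is "const pointer to const char" (const is transitive),
    // so const(char*)* matches C's "const char * const *".
    PGconn* PQconnectdbParams(const(char*)* keywords,
                              const(char*)* values,
                              int expand_dbname);
}
```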
 string[] keywords = ["hostaddr", "port", "dbname"];
 string[] values = ["127.0.0.1", "5432", "testdb"];
     conn = PQconnectdbParams(cast(const char **)kws,
                              cast(const char **)vals, 1);

 So my question is: is there an easier or better way of passing 
 two arrays of C null-terminated strings to an extern(C) 
 function?
If and only if the values are known at compile time, you can do:

    const char** keywords = ["hostaddr".ptr, "port".ptr, "dbname".ptr, null].ptr;

or even do it inline:

    PQconnectdbParams(["hostaddr".ptr, "port".ptr, "dbname".ptr, null].ptr, ditto_for_values, 1);

Otherwise, what you did there is decent... being a C style of array of arrays, it will need to be coded in a C style, with stuff like malloc and toStringz to convert D strings to C strings too.
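For runtime values, the same idea can be sketched without any casts (a sketch, not from any existing binding; the helper name is made up, and std.string.toStringz does the per-string copying):

```d
import std.string : toStringz;

// Hypothetical helper: build a null-terminated array of C strings
// from a D string slice, suitable for a const(char*)* parameter.
const(char)*[] toCArray(string[] ss)
{
    auto ptrs = new const(char)*[](ss.length + 1);
    foreach (i, s; ss)
        ptrs[i] = toStringz(s);   // allocates a NUL-terminated copy
    ptrs[ss.length] = null;       // libpq wants NULL termination
    return ptrs;
}

// usage (sketch):
// auto kws  = toCArray(["hostaddr", "port", "dbname"]);
// auto vals = toCArray(["127.0.0.1", "5432", "testdb"]);
// conn = PQconnectdbParams(kws.ptr, vals.ptr, 1);
```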
Jan 12
parent reply Joe <jma freedomcircle.com> writes:
On Saturday, 13 January 2018 at 04:26:06 UTC, Adam D. Ruppe wrote:
 If and only if the values are known at compile time, you can do:

 const char** keywords = ["hostaddr".ptr, "port".ptr, 
 "dbname".ptr, null].ptr;

 or even do it inline:


 PQconnectdbParams(["hostaddr".ptr, "port".ptr, "dbname".ptr, 
 null].ptr, ditto_for_values, 1);
The keywords are (or could be) known at compile time, but almost by definition, the associated values are only known at runtime.
Jan 12
parent reply Joe <jma freedomcircle.com> writes:
Going beyond the connection, there are various other libpq 
functions that use a similar pattern of values passed using 
multiple parallel C arrays, e.g.,

    PGresult *PQexecParams(PGconn *conn,
                        const char *command,
                        int nParams,
                        const Oid *paramTypes,
                        const char * const *paramValues,
                        const int *paramLengths,
                        const int *paramFormats,
                        int resultFormat);

Each of the `paramXxxxs' arguments are arrays (Oid is an alias 
for uint).

    PGresult *PQprepare(PGconn *conn,
                     const char *stmtName,
                     const char *query,
                     int nParams,
                     const Oid *paramTypes);

    PGresult *PQexecPrepared(PGconn *conn,
                          const char *stmtName,
                          int nParams,
                          const char * const *paramValues,
                          const int *paramLengths,
                          const int *paramFormats,
                          int resultFormat);

My point is that there seems to be a need to have a generic or 
generalized mechanism for passing these argument arrays from D to 
C.
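One way such a generalized mechanism could look (a sketch, not part of any existing library; the template name and nullTerminate flag are made up) is a small helper that copies any D slice into a GC-allocated array whose .ptr can be handed to libpq:

```d
// Hypothetical generic helper: copy a D slice into a freshly
// allocated array so its .ptr can be passed as a C T* argument.
T[] toCBuffer(T)(const(T)[] src, bool nullTerminate = false)
{
    auto buf = new T[](src.length + (nullTerminate ? 1 : 0));
    buf[0 .. src.length] = src[];
    // the trailing element, if requested, stays T.init (null for pointers)
    return buf;
}

// usage sketch, assuming Oid is an alias for uint:
// Oid[] paramTypes = [23, 25];
// auto types = toCBuffer(paramTypes);   // pass types.ptr as const(Oid)*
```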
Jan 12
parent reply Joe <jma freedomcircle.com> writes:
On Saturday, 13 January 2018 at 05:28:17 UTC, Joe wrote:
 Going beyond the connection, there are various other libpq 
 functions that use a similar pattern of values passed using 
 multiple parallel C arrays, e.g.,

    PGresult *PQexecParams(PGconn *conn,
                        const char *command,
                        int nParams,
                        const Oid *paramTypes,
                        const char * const *paramValues,
                        const int *paramLengths,
                        const int *paramFormats,
                        int resultFormat);

 Each of the `paramXxxxs' arguments are arrays (Oid is an alias 
 for uint).
 [...]
Focusing on the above function, suppose the first two parameter arrays are defined in a C program as:

    Oid paramTypes[] = {23, 25};
    char *paramValues[] = {"1", "abcde"};

which could be expressed in D as:

    Oid [] paramTypes = [23, 25];
    string [] paramValues = ["1", "abcde"];

I know the paramTypes could be passed as null, letting PG deduce the data types, but suppose we have some reason to accumulate types, etc., in D slices. I know the paramValues can be manipulated in a manner similar to the one shown in my first post, namely something like this:

    extern(C) char * [2] pvs;
    foreach (i, val; paramValues)
        pvs[i] = cast(char *)toStringz(val);

And then use "cast(const char **)pvs" for the paramValues argument.

The only feedback that I received that was explicit to this approach was in response to my initial post, in which Adam D. Ruppe said that what I did was "decent". So I have two lingering doubts:

1. Is malloc() the only way to allocate the arrays, either of Oid's, int's or char *'s, for passing to the libpq functions? IOW, isn't there a way to take advantage of D's 'new' (and thus the GC)?

2. How to convert a slice of Oid's or int's to an array of pointers suitable for processing by the libpq C function? A technique similar to the previous one seems to work, e.g.,

    extern(C) Oid [2] pts;
    foreach (i, typ; paramTypes)
        pts[i] = typ;

But I'm not sure if it's really working (when I mistakenly had a * in the pts declaration, at one point it also seemed to work).
Jan 19
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 20 January 2018 at 04:09:01 UTC, Joe wrote:
    extern(C) char * [2] pvs;
    foreach (i, val; paramValues)
        pvs[i] = cast(char *)toStringz(val);

 And then use "cast(const char **)pvs" for the paramValues 
 argument.
A slight improvement here that removes the need for any casts (and the extern(C) is useless here, btw) is to just define it as const(char)*:

    const(char) * [2] pvs;
    foreach (i, val; paramValues)
        pvs[i] = toStringz(val);

Then you can call the argument with plain `pvs.ptr` instead of casting it. Though, make sure you remember the null at the end too!
 1. Is malloc() the only way to allocate the arrays, either of 
 Oid's, int's or char *'s, for passing to the libpq functions?  
 IOW, isn't there a way to take advantage of D's 'new' (and thus 
 the GC)?
    const(char)*[] pvs = new const(char)*[](size_here);
    // could also have been `auto pvs = ...;` btw
    // then still pass it to pqFunc(pvs.ptr)
    // the .ptr gets the char** out of it.

Just double check the documentation of any C function you pass it to, to make sure it doesn't explicitly say you must malloc it. Some C functions will try to free() the pointer you pass, or may hold on to it. D's GC can't see pointers C functions keep, so it might try to free that array any time after the local variable `pvs` in D goes out of scope.
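If a C function does hold on to the pointer, one option (a sketch; whether it applies depends on the function's documented ownership rules) is to pin the allocation so the GC won't collect it while the C side still uses it:

```d
import core.memory : GC;

void connectKeepingAlive()
{
    auto pvs = new const(char)*[](4);
    GC.addRoot(pvs.ptr);    // keep the array alive even after pvs goes out of scope
    // ... pass pvs.ptr to a C function that retains the pointer ...
    // later, once the C side is guaranteed to be done with it:
    // GC.removeRoot(pvs.ptr);
}
```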
 2. How to convert a slice of Oid's or int's to an array of 
 pointers suitable by processing by the libpq C function?
Same as above. The general pattern is:

    C_Type[] name = new C_Type[](requested_size);
    // pass as `name.ptr`. This becomes a C_Type*

Just again, remember requested_size needs to include the null terminator when C requires it, so a +1 might be helpful. And also, no need for extern(C). That's more for function declarations/pointers, or maybe global variables C needs to be able to see.
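Putting that pattern together for a PQexecParams-style call might look like this (a sketch only: the PGconn/PGresult/PQexecParams extern(C) declarations and the conn setup are assumed to exist elsewhere, as in the signatures quoted above):

```d
import std.string : toStringz;

alias Oid = uint;   // as noted above, Oid is an alias for uint

PGresult* runQuery(PGconn* conn)   // sketch; types declared elsewhere
{
    Oid[] paramTypes = [23, 25];            // OIDs for int4 and text
    string[] paramValues = ["1", "abcde"];

    auto types = paramTypes.dup;            // GC-allocated copy; pass types.ptr
    auto vals = new const(char)*[](paramValues.length);
    foreach (i, v; paramValues)
        vals[i] = toStringz(v);             // NUL-terminated C copies

    return PQexecParams(conn, "SELECT $1::int4, $2::text", 2,
                        types.ptr, vals.ptr, null, null, 0);
}
```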
 But I'm not sure if it's really working (when I mistakenly had 
 a * in the pts declaration, at one point it also seemed to 
 work).
Oh yikes, that would have been undefined behavior there. So [x] on a pointer will just access memory past it - just like in C - and might sometimes work, might sometimes overwrite other memory (in your case, it probably overwrote some other local variable on the stack), or, best case scenario really, might cause a segmentation fault or access violation.
Jan 19
parent Joe <jma freedomcircle.com> writes:
On Saturday, 20 January 2018 at 04:54:47 UTC, Adam D. Ruppe wrote:
 Same as above. The general pattern is:

 C_Type[] name = new C_Type[](requested_size);
 // pass as `name.ptr`. This becomes a C_Type*
Thanks, Adam. Perhaps something like this ought to make its way into the "D for C Programmers" page.
Jan 20
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2018-01-13 05:17, Joe wrote:
 I'm trying to learn how to use D to connect (and send queries) to 
 Postgres, i.e., libpq in C.
 So my question is: is there an easier or better way of passing two 
 arrays of C null-terminated strings to an extern(C) function?
There's a native D library, ddb [1], for connecting to Postgres. Then you don't have to worry about null-terminated strings.

[1] http://code.dlang.org/packages/ddb

-- 
/Jacob Carlborg
Jan 13
parent reply Joe <jma freedomcircle.com> writes:
On Saturday, 13 January 2018 at 10:10:41 UTC, Jacob Carlborg 
wrote:
 There's a native D library, ddb [1], for connecting to 
 Postgres. Then you don't have to worry about null-terminated 
 strings.
There are several D libraries that I would consider "native": derelict-pq, dpq, dpq2 and ddb. The latter perhaps has the distinction that it doesn't use libpq, but rather implements the Postgres FE/BE protocol. That's a bit *too* native for my taste. It means the library maintainer has to keep up with changes to the internal protocol which, although published, the Postgres group doesn't have to keep compatible from version to version. The libpq API, by contrast, is kept stable: for example, they haven't dropped the PQsetdbLogin function even though the PQconnectdb and PQconnectdbParams functions are obviously preferred. OTOH, there used to be an AsciiRow message format in the protocol that was dropped, unceremoniously (not even mentioned in the release notes).
Jan 13
next sibling parent reply Matthias Klumpp <mak debian.org> writes:
On Saturday, 13 January 2018 at 17:58:14 UTC, Joe wrote:
 On Saturday, 13 January 2018 at 10:10:41 UTC, Jacob Carlborg 
 wrote:
 There's a native D library, ddb [1], for connecting to 
 Postgres. Then you don't have to worry about null-terminated 
 strings.
 There are several D libraries that I would consider "native": 
 derelict-pq, dpq, dpq2 and ddb. [...]
If you are after a good way to use Postgres in a real-world application, I highly recommend ddbc [1] (which also supports other backends).

There are a lot of D Postgres bindings out there, and all of them are about 70% completed, but nobody really bothered to make one finished and really good (and well maintained) binding. DDBC is really close to being complete, and contains a few convenience features that make it nice to use in an application. It also is used by Hibernated [2] in case you want an ORM for your app at some point. Both libraries aren't up to tools like SQLAlchemy & Co. from other programming languages, but they are decent. For simple cases, dpq2 & Co. might work well enough as well.

In any case, please don't start another Postgres library and consider contributing to one of the existing ones, so that we maybe have one really awesome, 100% complete library at some point.

If, on the other hand, your goal is to learn about the low-level Postgres interface and not just to have a Postgres interface for an application you develop, by all means, play with it :-)

Cheers,
    Matthias

[1]: https://github.com/buggins/ddbc
[2]: https://github.com/buggins/hibernated
Jan 14
parent Joe <jma freedomcircle.com> writes:
On Monday, 15 January 2018 at 02:28:29 UTC, Matthias Klumpp wrote:
 In any case, please don't start another Postgres library and 
 consider contributing to one of the existing ones, so that we 
 maybe have one really awesome, 100% complete library at some 
 point.

 If, on the other hand, your goal is to learn about the 
 low-level Postgres interface and not just to have a Postgres 
 interface for an application you develop, by all means, play 
 with it :-)
At this point, I am indeed learning about low-level Postgres interfaces (but not so low-level as the client-server protocol) as a way to understand the challenges of interfacing D to C.

However, as part of the Pyrseas open source project, which I maintain, I had started to create a Postgres interface in Python inspired by The Third Manifesto, as opposed to ORMs like SQLAlchemy (see https://pyrseas.wordpress.com/2013/03/07/a-pythonic-ttm-inspired-interface-to-postgresql-requirements/). I got criticized for "reinventing the wheel", but I believe TTM, if properly done, is quite different from an ORM.

I understand your concern about not starting another PG library. From a cursory investigation of the existing libraries, I think they span a spectrum, with ddb at one end (low-level protocol), then derelict-pq (low-level binding over libpq), ddbc at the opposite end (multi-DBMS support) and several others in between. So I guess the real problem is with the proliferation in the latter group.
Jan 15
prev sibling parent Boris-Barboris <ismailsiege gmail.com> writes:
On Saturday, 13 January 2018 at 17:58:14 UTC, Joe wrote:
 ...ddb. The latter perhaps has the distinction that it doesn't 
 use libpq, but rather implements the Postgres FE/BE protocol. 
 That's a bit *too* native for my taste. It means the library 
 maintainer has to keep up with changes to the internal 
 protocol, which although published, the Postgres group doesn't 
 have to maintain compatibility from version to version.
Not that it matters, but the client-server protocol is actually the most stable one; it hasn't changed since Postgres 7.4 (release date: 2003-11-17). It's the language-level abstractions like libpq that keep being changed/updated on almost every release.
Jan 16