
digitalmars.D - Native D MySQL Driver

reply Steve Teale <steve.teale britseyeview.com> writes:
I'm getting pretty close to having a reasonably comprehensive native D 
driver for MySQL - no translated header files, no linking to libraries.

However, there are three blocking factors - features missing in Phobos that 
would prevent its use as things stand.

1) No UNIX Socket support in std.socket.
2) No Windows Named Pipe support that I've noticed.
3) No SHA1 digest support

I also have a question. MySQL potentially produces result sets of up to 
64 bit size - rows. The MySQL client theoretically copes with these by 
storing a result set as a doubly linked list, and there's nothing to stop 
me doing the same. But is it necessary?

The whole point of a database seems to me to be to access the data you 
need, rather than all the data in the Universe at the same time. Even if 
I opt for result sets limited to 32 bit row numbers, if you pulled one 
from the server with rows of only 20 bytes, you'd still be talking about 
80Gb of memory. Does any application do that sort of thing?

Even if it did, the driver can still produce the rows one at a time so 
that the application could devise its own storage strategy (perhaps it 
could write the rows to a temporary database table ;=))

Steve
Nov 03 2011
next sibling parent mta`chrono <chrono mta-international.net> writes:
 1) No UNIX Socket support in std.socket.

Someone has added unix domain sockets in one of the phobos forks. I even think there's a pending pull request.
Nov 03 2011
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 3, 2011, at 10:15 PM, Steve Teale wrote:

 I also have a question. MySQL potentially produces result sets of up to
 64 bit size - rows. The MySQL client theoretically copes with these by
 storing a result set as a doubly linked list, and there's nothing to stop
 me doing the same. But is it necessary?

 The whole point of a database seems to me to be to access the data you
 need, rather than all the data in the Universe at the same time. Even if
 I opt for result sets limited to 32 bit row numbers, if you pulled one
 from the server with rows of only 20 bytes, you'd still be talking about
 80Gb of memory. Does any application do that sort of thing?

I think they probably did this simply to avoid imposing an artificial limitation on the number of rows that may be returned, regardless of whether anyone would actually do so. Practically speaking... an app designed to move the entire contents of a table from one DB to another might do this, though it would be far from the most efficient means of doing so. I can't think of any realistic example of an app that would want more than 2^32 rows in a response.
Nov 04 2011
prev sibling next sibling parent bls <bizprac orange.fr> writes:
On 11/03/2011 10:15 PM, Steve Teale wrote:
 2) No Windows Named Pipe support that I've noticed.

Isn't it part of std.windows? Named pipe support is part of kernel32.dll. Maybe this link helps: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365150(v=vs.85).aspx
 3) No SHA1 digest support

There is an MIT/X11 licensed lib on dsource called dcrypt. A public domain C lib is available at http://libtom.org/?page=features&newsitems=5&whatfile=crypt and finally (I like it) a Javascript snippet, with no licence information: http://www.webtoolkit.info/javascript-sha1.html

HTH
Bjoern
Nov 04 2011
prev sibling next sibling parent reply Piotr Szturmaj <bncrbme jadamspam.pl> writes:
 I also have a question. MySQL potentially produces result sets of up to
 64 bit size - rows. The MySQL client theoretically copes with these by
 storing a result set as a doubly linked list, and there's nothing to stop
 me doing the same. But is it necessary?

Why not return a range that will fetch the next row when needed? popFront() will call the socket's recv, parse, and then store that row for the front() calls. This may be internally buffered to 100 rows or so.

Example: https://github.com/pszturmaj/ddb/blob/master/postgres.d#L2145
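Piotr's suggestion could be sketched in D roughly as below; `Row` and the `fetch` delegate (standing in for the driver's recv-and-parse step) are illustrative, not the actual ddb or mysqln API:

```d
import std.conv : to;

struct Row { string[] fields; }

/// Illustrative lazy result range: rows are pulled from `fetch`
/// (a stand-in for recv + parse on the driver's socket) only as
/// the consumer advances. A real driver might buffer ~100 rows.
struct ResultRange
{
    Row delegate() fetch;
    size_t remaining;   // rows the server has yet to deliver
    Row current;

    this(Row delegate() fetch, size_t total)
    {
        this.fetch = fetch;
        remaining = total;
        if (remaining > 0)
            current = fetch();
    }

    @property bool empty() { return remaining == 0; }
    @property Row front() { return current; }

    void popFront()
    {
        --remaining;
        if (remaining > 0)
            current = fetch();   // fetch lazily, one row per step
    }
}

void main()
{
    size_t n = 0;
    auto rows = ResultRange(() => Row([to!string(n++)]), 3);
    string[] seen;
    foreach (r; rows)
        seen ~= r.fields[0];
    assert(seen == ["0", "1", "2"]);
}
```

Because the range only holds the current row (plus whatever internal buffer the driver chooses), memory use is independent of the result set size.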
Nov 04 2011
parent Steve Teale <steve.teale britseyeview.com> writes:
On Fri, 04 Nov 2011 12:13:34 +0100, Piotr Szturmaj wrote:
 
 Why not return a range, that will fetch next row when needed? popFront()
 will call socket's recv, parse and then store that row for the front()
 calls. This may be internally buffered to 100 rows or so.
 
 Example: https://github.com/pszturmaj/ddb/blob/master/postgres.d#L2145

I use a range in both cases, getting the whole result set or fetching a row at a time. It's just the whole result set case I'm wondering about.

Steve
Nov 04 2011
prev sibling next sibling parent reply Dejan Lekic <dejan.lekic gmail.com> writes:
Steve Teale wrote:

 
 I also have a question. MySQL potentially produces result sets of up to
 64 bit size - rows. The MySQL client theoretically copes with these by
 storing a result set as a doubly linked list, and there's nothing to stop
 me doing the same. But is it necessary?

Yes it is. It is common practice to get the whole thing rather than to take it in chunks (atomically, so to speak).

1) The first thing that comes to mind is efficiency.
2) The second is that you remove the complexity (you will have to deal with various chunks).
3) The third reason, and the most important one, is the high possibility that your result set is going to change while you receive chunks. So, if you received it in chunks, it is possible that the overall data may be INCONSISTENT.

So if you really wanted it to be done in chunks then you would have to make a snapshot of the result set on the database side, while behind the scenes those records keep being updated...
Nov 04 2011
next sibling parent reply Steve Teale <steve.teale britseyeview.com> writes:
On Fri, 04 Nov 2011 15:04:39 +0000, Dejan Lekic wrote:
 
 Yes it is. It is a common practice to get the whole thing rather than to
 take it in chunks. (Atomically so to speak) 1) The first thing that
 comes to mind is efficiency. 2) The second is - you remove the
 complexity (you will have to deal with various chunks).
 3) The third reason, and the most important one, is the high possibility
 that your result set is going to change while you receive chunks. So, if
 you received it in chunks, it is possible that the overall data may be
 INCONSISTENT. So if you really wanted it to be done in chunks then you
 would have to make a snapshot of the result set on the database side,
 while behind the scenes those records keep being updated...

Dejan,

Don't get me wrong. I was not asking if it should be possible to get a complete result set, only if we needed to support sets with 2^^64 rows on a 32 bit system. Personally I think that is close to stupid - SQL is supposed to allow you to select the rows you want - not to make you do that on the client side. I will also support the row by row approach, because there are users who prefer that.

With the MySQL protocol, I don't think you need to worry about changes while stepping through rows. It's very basic - the server sends the whole result set over the wire in response to a query. Whether you read them all first or deal with them one at a time is academic. You still have to read them all before you can do anything else on that thread. So what you get is a snapshot either way.

Steve
Nov 04 2011
next sibling parent Dejan Lekic <dejan.lekic gmail.com> writes:
Steve Teale wrote:

 
 Don't get me wrong. I was not asking if it should be possible to get a
 complete result set, only if we needed to support sets with 2^^64 rows on
 a 32 bit system. Personally I think that is close to stupid - SQL is
 supposed to allow you to select the rows you want - not to make you do
 that on the client side.

It is all right - you asked for opinions - you got it. :)

I am also writing a native D database driver, for the Ingres RDBMS. It is basically an implementation of all the Ingres DAMP protocols (lvl 1 to 7, the latest one). There you really need to make a cursor if you really want to do something like you describe above. However, the Ingres RDBMS will send you the result in messages (ie. chunks). But it really does not matter if it sends the result in chunks or if it streams the whole thing immediately (I would actually prefer Ingres did that; it would make some things easier).

If I as a developer execute "SELECT * FROM few_bil_rows_table" I am *aware* of the consequences, but I still want *ALL ROWS*, because, say, I am doing an OLAP query, and I want to generate a huge report ASAP. :) Ask any BI expert, and they will tell you that a 2^64 result set is nothing! :)

Yes, hands down, if one deals with 2^64 result sets, that person will most likely *not* use MySQL/MariaDB but something more suitable, like Ingres (now Actian) VectorWise, and similar.

If I wanted to do it in chunks, I would code that myself, and execute something like "SELECT * FROM few_bil_rows_table LIMIT offset, rowCount".
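Dejan's manual chunking, sketched in D. Only the generated SQL matters here; table and sizes are made up, and `table` is assumed to be a trusted identifier (no user input):

```d
import std.format : format;

/// Build the successive LIMIT queries for client-side chunking.
/// Each query fetches `chunkSize` rows starting at a growing offset.
string[] chunkQueries(string table, size_t totalRows, size_t chunkSize)
{
    string[] queries;
    for (size_t offset = 0; offset < totalRows; offset += chunkSize)
        queries ~= format("SELECT * FROM %s LIMIT %s, %s",
                          table, offset, chunkSize);
    return queries;
}

void main()
{
    auto qs = chunkQueries("few_bil_rows_table", 10, 4);
    assert(qs.length == 3);
    assert(qs[0] == "SELECT * FROM few_bil_rows_table LIMIT 0, 4");
    assert(qs[2] == "SELECT * FROM few_bil_rows_table LIMIT 8, 4");
}
```

As Dejan notes, the chunks are separate queries, so consistency across them is the application's problem unless the server snapshots the set.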
Nov 04 2011
prev sibling parent reply bls <bizprac orange.fr> writes:
On 11/04/2011 09:25 AM, Steve Teale wrote:
   I don't think you need to worry
 about changes while stepping through rows.

That's questionable. Consider stock market quotes.

Some lib designers, for instance libpqxx (C++ PostgreSQL), have a different view. Let me quote:

*With conventional database APIs, you issue commands and queries to a database session or connection, and optionally create the occasional transaction. In libpqxx you start with a connection, but you do all your SQL work in transactions that you open in your connection. You commit each transaction when it's complete; if you don't, all changes made inside the transaction get rolled back.*

However, there are always many ways to skin a cat. PostgreSQL, Firebird and MSSQL Server (not sure about MySQL) enable you to trigger server side events. That means the database server fires an update event (say an employee deleted event) and a D function living in a secondary thread will catch this event and inform all interested parties (subscribers**) to refresh their RowSet (and of course the UI).

Beside, this is the way we create mission critical, db-centric realtime apps. Works fine.

My 2 cents.
Bjoern

** Publisher/Subscriber pattern.
Nov 04 2011
next sibling parent bls <bizprac orange.fr> writes:
On 11/04/2011 12:05 PM, Steve Teale wrote:
 Your point is in fact in support of what I was saying. Why the hell allow
 the user to get 2^^64 rows when they're going to be out of date before
 they've been got together!

UI/RowSet, just in case of relevance: another point is that you can't load them (without chunking) into a table control/widget. So I would say, don't care about the 2^^64 rows.

How we use server side events, in case this is interesting:

A window subscribes to a server-side event publisher. The publisher informs all subscribers (windows). The window clears and refills the row-set and finally displays the data. Closing the window means unregistering from the publisher. Bla, bla.

Bjoern
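The subscribe/notify/unregister cycle bls outlines could look like this in D; all names are illustrative and no real event library is implied:

```d
import std.algorithm : remove;

/// Minimal publisher/subscriber sketch: windows register a refresh
/// callback, and the publisher invokes every callback on an update.
class Publisher
{
    private void delegate()[] subscribers;

    void subscribe(void delegate() cb) { subscribers ~= cb; }

    void unsubscribe(void delegate() cb)
    {
        subscribers = subscribers.remove!(s => s is cb);
    }

    void publish()
    {
        foreach (cb; subscribers)
            cb();   // e.g. clear and refill the window's row-set
    }
}

void main()
{
    auto pub = new Publisher;
    int refreshes = 0;
    void delegate() window = { ++refreshes; };

    pub.subscribe(window);
    pub.publish();          // the "window" refreshes once
    pub.unsubscribe(window); // i.e. the window was closed
    pub.publish();          // no subscribers left
    assert(refreshes == 1);
}
```

In the real app the publisher would run in the secondary thread that listens for the database event, and the callbacks would marshal the refresh back to the UI thread.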
Nov 04 2011
prev sibling next sibling parent bls <bizprac orange.fr> writes:
On 11/04/2011 03:00 PM, Sean Kelly wrote:
 Unless something has changed in the last few years, the way to do this in MSSQL
is via a plugin in the SQL server itself.  Back in the days of yore this was
via extended stored procedures, which were loaded as a DLL by the SQL server,
but I think more recent iterations can call COM or .NET code or something like
that as well.  In short, the event notification isn't typically done via a
back-feed through a client connection.  Do any modern DBMSes have pub/sub
functionality like this?

F.I. PostgreSQL:

1)

CREATE OR REPLACE FUNCTION notify_demo() RETURNS TRIGGER AS '
BEGIN
    NOTIFY demoApp;
    RETURN NULL;
END;
'

2)

CREATE TRIGGER demo AFTER UPDATE ON users
FOR EACH ROW EXECUTE PROCEDURE notify_demo();

3) C#

using System;
using Npgsql;

namespace TriggerDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            string connString = "Server=127.0.0.1;Port=5432;User Id=postgres;Password=pass;SyncNotification=true";
            NpgsqlConnection conn = new NpgsqlConnection(connString);
            try
            {
                conn.Open();
                NpgsqlCommand cmd = new NpgsqlCommand("listen demoApp;", conn);
                cmd.ExecuteNonQuery();
                conn.Notification += new NotificationEventHandler(conn_Notification);
                Console.ReadLine();
                conn.Close();
            }
            catch (NpgsqlException ex)
            {
                Console.WriteLine(ex.Message);
                Console.ReadLine();
            }
        }

        static void conn_Notification(object sender, NpgsqlNotificationEventArgs e)
        {
            Console.WriteLine("Row was updated");
        }
    }
}

Firebird is very similar. I think MSSQL doesn't need a plugin. I have some test code hanging around, so in case this is more than just curiosity, I can send you a copy. Guess it was SQL Server 2008, but the .NET code is pretty complex.

Regarding MySQL - it simply has no support for server side events.

Bjoern
Nov 05 2011
prev sibling parent bls <bizprac orange.fr> writes:
On 11/05/2011 10:23 AM, Sean Kelly wrote:
   Is it possible to listen on multiple events with a single connection?

Sure, it is possible. See my code snippet.
Nov 05 2011
prev sibling next sibling parent Steve Teale <steve.teale britseyeview.com> writes:
On Fri, 04 Nov 2011 10:19:23 -0700, bls wrote:

 On 11/04/2011 09:25 AM, Steve Teale wrote:
   I don't think you need to worry
 about changes while stepping through rows.


As in don't worry about something over which you have no control!
 However, there are always many ways to skin a cat. PostgreSQL, Firebird,
 MSSQL Server (not sure about MySQL) enable you to trigger server side
 events.
 Means the database server fires an update event (say employee deleted
 event) and a D function living in a secondary thread will catch this
 event and inform all interested parties (subscribers**) to refresh their
 RowSet. (and of course the UI)
 
 Beside, this is the way we create mission critical,db-centric realtime
 apps. works fine.
 My 2 cents.

Bjoern,

There's not a hint of that in the MySQL protocol. It's purely command/response. The commands are:

0x00 COM_SLEEP               (none, this is an internal thread state)
0x01 COM_QUIT                mysql_close
0x02 COM_INIT_DB             mysql_select_db
0x03 COM_QUERY               mysql_real_query
0x04 COM_FIELD_LIST          mysql_list_fields
0x05 COM_CREATE_DB           mysql_create_db (deprecated)
0x06 COM_DROP_DB             mysql_drop_db (deprecated)
0x07 COM_REFRESH             mysql_refresh
0x08 COM_SHUTDOWN            mysql_shutdown
0x09 COM_STATISTICS          mysql_stat
0x0a COM_PROCESS_INFO        mysql_list_processes
0x0b COM_CONNECT             (none, this is an internal thread state)
0x0c COM_PROCESS_KILL        mysql_kill
0x0d COM_DEBUG               mysql_dump_debug_info
0x0e COM_PING                mysql_ping
0x0f COM_TIME                (none, this is an internal thread state)
0x10 COM_DELAYED_INSERT      (none, this is an internal thread state)
0x11 COM_CHANGE_USER         mysql_change_user
0x12 COM_BINLOG_DUMP         sent by the slave IO thread to request a binlog
0x13 COM_TABLE_DUMP          LOAD TABLE ... FROM MASTER (deprecated)
0x14 COM_CONNECT_OUT         (none, this is an internal thread state)
0x15 COM_REGISTER_SLAVE      sent by the slave to register ...
0x16 COM_STMT_PREPARE        mysql_stmt_prepare
0x17 COM_STMT_EXECUTE        mysql_stmt_execute
0x18 COM_STMT_SEND_LONG_DATA mysql_stmt_send_long_data
0x19 COM_STMT_CLOSE          mysql_stmt_close
0x1a COM_STMT_RESET          mysql_stmt_reset
0x1b COM_SET_OPTION          mysql_set_server_option
0x1c COM_STMT_FETCH          mysql_stmt_fetch

I don't see a 'let me know when something happens' command here. There is also no trace of it in the C API. Maybe it's unsophisticated, but it's popular, and that's the beast I'm dealing with.

Your point is in fact in support of what I was saying. Why the hell allow the user to get 2^^64 rows when they're going to be out of date before they've been got together!

Steve
Nov 04 2011
prev sibling next sibling parent Steve Teale <steve.teale britseyeview.com> writes:
On Fri, 04 Nov 2011 16:58:23 +0000, Dejan Lekic wrote:
 
 It is all right - you asked for opinions - you got it. :)
 
 If I wanted to do it in chunks, I would code that myself, and execute
 something like "SELECT * FROM few_bil_rows_table LIMIT offset,
 rowCount".

That's close to my point. Someone wanting the whole shebang would not have any particular difficulty in using the interfaces I'm proposing, without explicitly asking for chunks.

The server sends the whole thing; all I'm doing for the complete result set case is to aggregate all - or some restricted number - of the rows. I don't see any efficiency penalty for a user who chooses to aggregate the same incoming stream of rows in some other way. There's no work for her to do - they're already decoded.

But what is going to be the most common aggregation requirement? I'm thinking it is a size_t array of rows.

Steve
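The aggregation described here - collecting an already-decoded row stream into an array indexed by size_t - is essentially one line with Phobos; the incoming row stream below is simulated, since the real driver's range type isn't shown in the thread:

```d
import std.array : array;
import std.algorithm : map;
import std.conv : to;
import std.range : iota;

void main()
{
    // Stand-in for the decoded rows arriving off the wire.
    auto incoming = iota(3).map!(i => ["id", to!string(i)]);

    // Whole-result-set case: aggregate into an array indexed by size_t.
    auto resultSet = incoming.array;

    assert(resultSet.length == 3);
    assert(resultSet[2] == ["id", "2"]);
}
```

A user preferring a different aggregation (a linked list, a temporary table) would consume the same range and skip the `.array` step.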
Nov 04 2011
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 4, 2011, at 12:05 PM, Steve Teale wrote:

 On Fri, 04 Nov 2011 10:19:23 -0700, bls wrote:

 On 11/04/2011 09:25 AM, Steve Teale wrote:
  I don't think you need to worry
 about changes while stepping through rows.

 As in don't worry about something over which you have no control!

 However, there are always many ways to skin a cat. PostgreSQL, Firebird,
 MSSQL Server (not sure about MySQL) enable you to trigger server side
 events.
 Means the database server fires an update event (say employee deleted
 event) and a D function living in a secondary thread will catch this
 event and inform all interested parties (subscribers**) to refresh their
 RowSet. (and of course the UI)

 Beside, this is the way we create mission critical, db-centric realtime
 apps. works fine.
 My 2 cents.

 Bjoern

 There's not a hint of that in the MySQL protocol. It's purely command/
 response. The commands are:

 I don't see a 'let me know when something happens' command here. There is
 also no trace of it in the C API.

Unless something has changed in the last few years, the way to do this in MSSQL is via a plugin in the SQL server itself. Back in the days of yore this was via extended stored procedures, which were loaded as a DLL by the SQL server, but I think more recent iterations can call COM or .NET code or something like that as well. In short, the event notification isn't typically done via a back-feed through a client connection. Do any modern DBMSes have pub/sub functionality like this?
Nov 04 2011
prev sibling next sibling parent Dejan Lekic <dejan.lekic gmail.com> writes:
On Fri, 04 Nov 2011 15:00:17 -0700, Sean Kelly wrote:

 notification isn't typically done via a back-feed through a client
 connection.  Do any modern DBMSes have pub/sub functionality like this?

Ingres, for instance, has the DBEVENT, and it is possible to create/handle it via JDBC, ODBC, .NET ...

http://docs.actian.com/ingres/10.0/sql-reference-guide/1835-create-dbevent-syntax
Nov 05 2011
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 5, 2011, at 4:29 AM, Dejan Lekic wrote:

 On Fri, 04 Nov 2011 15:00:17 -0700, Sean Kelly wrote:

 notification isn't typically done via a back-feed through a client
 connection. Do any modern DBMSes have pub/sub functionality like this?

 Ingres, for instance, has the DBEVENT, and it is possible to create/
 handle it via JDBC, ODBC, .NET ...

Sounds like I've been out of the DB loop for too long. Is it possible to listen on multiple events with a single connection?
Nov 05 2011
prev sibling next sibling parent reply Somedude <lovelydear mailmetrash.com> writes:
Le 04/11/2011 06:15, Steve Teale a écrit :
 I'm getting pretty close to having a reasonably comprehensive native D 
 driver for MySQL - no translated header files, no linking to libraries.
 ...
 Steve

A few simple questions/requests:

1. Does your library allow getting the result set in batches of N rows? It's a common requirement for databases that don't have the equivalent of the LIMIT clause of MySQL, like Oracle. In those DBs, one has to resort to cursors to realistically retrieve large result sets, or to write yet again the piece of code to send several successive queries.

2. Would it be possible to automatically map a result set to an object, maybe without having to resort to a full-fledged ORM, in the style of MyBatis (but without XML, preferably)?

3. While we are on the subject of DB drivers, did anyone mention the need for managing a connection pool?

Thx,
Dude
Nov 05 2011
parent reply Steve Teale <steve.teale britseyeview.com> writes:
On Sat, 05 Nov 2011 09:34:34 +0100, Somedude wrote:
 
 A few simple questions/requests:
 1. Does your library allow to get the resultset by batch of N rows ?
 It's a common requirement for databases that don't have the equivalent
 of the LIMIT clause of MySQL like Oracle. In those DBs, one has to
 resort to cursors to realistically retrieve large resultsets, or to
 write yet again the piece of code to send several successive queries.
 
 2. would it be possible to automatically map a resultset to an object,
 maybe without having to resort to a full-fledged ORM, in the style of
 MyBatis (but without XML, preferably) ?
 
 3. While we are on the subject of DB driver, did anyone mention the need
 for managing a connexion pool ?
 

What I'm working on at the moment is the MySQL driver. That does as you note have the LIMIT clause, so I had no need to do anything special.

I have provided the capability to map rows to a struct. I guess it would be possible to do the same with a class provided that all the class data members corresponded to columns, or that a subset of them existed as a struct with that limitation, or could be identified as candidates in some way.

Connection pooling is on my list of things to do.

Steve.
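Row-to-struct mapping of the kind mentioned above can be done in D with compile-time reflection. This is a hedged sketch, not the actual mysqln interface: the struct and `mapRow` are invented names, and a real mapper would also handle NULLs and the server's column type codes rather than assume text fields:

```d
import std.conv : to;

struct Employee
{
    int    id;
    string name;
}

/// Map one textual row onto a struct, field by field, converting
/// each column to the corresponding member's type via std.conv.
/// Assumes row.length matches the number of struct members.
T mapRow(T)(string[] row)
{
    T result;
    foreach (i, ref field; result.tupleof)   // compile-time unrolled
        field = to!(typeof(field))(row[i]);
    return result;
}

void main()
{
    auto e = mapRow!Employee(["7", "Ada"]);
    assert(e.id == 7);
    assert(e.name == "Ada");
}
```

Because `tupleof` is walked at compile time, a mismatch between column count and member count shows up as an error rather than silent misbinding, which is part of the appeal of struct mapping over a generic variant row.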
Nov 05 2011
parent Somedude <lovelydear mailmetrash.com> writes:
Le 05/11/2011 17:37, Steve Teale a écrit :
 On Sat, 05 Nov 2011 09:34:34 +0100, Somedude wrote:
 What I'm working on at the moment is the MySQL driver. That does as you 
 note have the LIMIT clause so I had no need to do anything special.
 
 I have provided the capability to map rows to a struct. I guess it would 
 be possible to do the same with a class provided that all the class data 
 members corresponded to columns, or that a subset of them existed as a 
 struct with that limitation, or could be identified as candidates in some 
 way.
 
 Connection pooling is on my list of things to do.
 
 Steve.

Thank you. Dude
Nov 05 2011
prev sibling parent reply Steve Teale <steve.teale britseyeview.com> writes:
I have pushed a first cut of the native MySQL driver to

https://github.com/britseye

The documentation can be seen at

britseyeview.com/software/mysqln/

I'm now investigating ODBC. That should be a fairly quick exercise, since 
I have already done ODBCD once, and I'm just doing a sketch at this 
point. I expect my conclusion to be that an almost identical interface is 
feasible. The licensing situation for ODBC on Linux appears to be OK. I 
can't speak for Windows - any thoughts? My primary target for an ODBC 
implementation is MS SQL Server. That could be done using the TDS 
protocol, but since FreeTDS has already been there and is LGPL, and we 
only need to link, I don't really want to go there.

I can see from Piotr's work on PostgreSQL that an almost identical 
interface is possible for that too. The license there looks friendly too, 
but again comments please. SQLite is a bit different by its nature, but I 
think that it can closely fit the pattern too - no license problems there 
I think.

I will write up a description of the interface I'm talking about in the 
next few days. This is what I see as the mid level interface, a set of 
modules - etc.mysql, etc.odbc, etc.postgres, etc.sqlite ...

It needs to be discussed and at some point agreed so that the current 
'team' can work toward it for the various databases.

We've recently seen an example of an SQLite interface that demonstrated a 
higher level style where some of the burden of getting SQL right was 
dealt with by the API. I detected some interest in it. Any further 
opinions on that? The water gets quite deep quite quickly if you go in 
that direction though, and think about how more complex SQL commands 
might be represented, the diversity of SQL dialects, and so on. Somewhere 
out there I can see etc.postgres.traits and such.

Sorry if this is rather long.

Steve
Nov 10 2011
parent Paulo Pinto <pjmlp progtools.org> writes:
You should not have any problems with ODBC on Windows regarding
licensing, I imagine.

After all you just need to code against the Win32 ODBC API. You
don't need to consider the real driver you are talking with.

--
Paulo

On 10.11.2011 18:54, Steve Teale wrote:
 I have pushed a first cut of the native MySQL driver to

 https://github.com/britseye

 The documentation can be seen at

 britseyeview.com/software/mysqln/

 I'm now investigating ODBC. That should be a fairly quick exercise, since
 I have already done ODBCD once, and I'm just doing a sketch at this
 point. I expect my conclusion to be that an almost identical interface is
 feasible. The licensing situation for ODBC on Linux appears to be OK. I
 can't speak for Windows - any thoughts? My primary target for an ODBC
 implementation is MS SQL Server. That could be done using the TDS
 protocol, but since FreeTDS have already been there and is LGPL, and we
 only need to link, I don't really want to go there.


Nov 11 2011