digitalmars.D.announce - HDF5 bindings for D

"Laeeth Isharc" <laeethnospam spammenot_laeeth.com> writes:
https://github.com/Laeeth/d_hdf5

HDF5 is a very valuable tool for those working with large data 
sets.

From hdfgroup.org:

HDF5 is a unique technology suite that makes possible the 
management of extremely large and complex data collections. The 
HDF5 technology suite includes:

* A versatile data model that can represent very complex data 
objects and a wide variety of metadata.
* A completely portable file format with no limit on the number 
or size of data objects in the collection.
* A software library that runs on a range of computational 
platforms, from laptops to massively parallel systems, and 
implements a high-level API with C, C++, Fortran 90, and Java 
interfaces.
* A rich set of integrated performance features that allow for 
access time and storage space optimizations.
* Tools and applications for managing, manipulating, viewing, and 
analyzing the data in the collection.
* The HDF5 data model, file format, API, library, and tools are 
open and distributed without charge.

From h5py.org:
[HDF5] lets you store huge amounts of numerical data, and easily 
manipulate that data from NumPy. For example, you can slice into 
multi-terabyte datasets stored on disk, as if they were real 
NumPy arrays. Thousands of datasets can be stored in a single 
file, categorized and tagged however you want.

H5py uses straightforward NumPy and Python metaphors, like 
dictionary and NumPy array syntax. For example, you can iterate 
over datasets in a file, or check out the .shape or .dtype 
attributes of datasets. You don't need to know anything special 
about HDF5 to get started.

In addition to the easy-to-use high level interface, h5py rests 
on an object-oriented Cython wrapping of the HDF5 C API. Almost 
anything you can do from C in HDF5, you can do from h5py.

Best of all, the files you create are in a widely-used standard 
binary format, which you can exchange with other people, 
including those who use programs like IDL and MATLAB.

===========
As far as I know there has not really been a complete set of HDF5 
bindings for D yet.

Bindings should have three levels:
1. pure C API declaration
2. 'nice' D wrapper around C API (e.g. one that knows about 
strings, not just char*)
3. idiomatic D interface that uses CTFE/templates
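
To make the three levels concrete, here is a minimal sketch. It 
does not use the real HDF5 API or anything from hdf5.d: the 
function fake_h5_name_length, the wrapper nameLength, the enum 
TypeTag and the template tagOf are all hypothetical names, and 
the C-style function is defined in D so the example links without 
libhdf5.

```d
import std.string : toStringz;

// Level 1: a C-style declaration. A real binding would only declare the
// library's functions; this hypothetical stand-in is defined here so the
// sketch builds without libhdf5.
extern (C) int fake_h5_name_length(const(char)* name) nothrow @nogc
{
    int n = 0;
    while (name[n] != '\0')
        ++n;
    return n;
}

// Level 2: a 'nice' wrapper that accepts a D string and does the
// conversion to a null-terminated char* itself.
int nameLength(string name)
{
    return fake_h5_name_length(name.toStringz);
}

// Level 3: a template that maps a D element type to a type tag at
// compile time, so callers never spell the tag out by hand.
enum TypeTag { int32, float64 }

template tagOf(T)
{
    static if (is(T == int))
        enum tagOf = TypeTag.int32;
    else static if (is(T == double))
        enum tagOf = TypeTag.float64;
    else
        static assert(0, "unsupported element type: " ~ T.stringof);
}

void main()
{
    assert(nameLength("dataset") == 7);
    static assert(tagOf!double == TypeTag.float64); // checked at compile time
}
```

An unsupported type such as tagOf!string is rejected when the 
program is compiled rather than when it runs, which is the main 
attraction of the third level.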

I borrowed Stefan Frijter's work on (1) above to get started.  I 
cannot keep track of things when split over too many source 
files, so I put everything in one file - hdf5.d.

I have implemented a basic version of (2).  It includes 
throwOnError, so callers do not have to check status codes C 
style, but the exception code is not very good or complete 
(limited time, plus my lack of experience with D exceptions).
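
As an illustration of the idea only, and not the code in hdf5.d, 
a throwOnError helper could look roughly like the sketch below. 
It assumes the usual HDF5 C convention that a negative return 
value signals failure. The signature of throwOnError, the class 
StatusException and the stand-in function fake_close are 
hypothetical.

```d
import std.conv : to;

// Hypothetical exception type that records the failing status code.
class StatusException : Exception
{
    int status;

    this(int status, string call,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(call ~ " failed with status " ~ status.to!string, file, line);
        this.status = status;
    }
}

// Returns the status unchanged when it is non-negative, otherwise throws.
int throwOnError(int status, string call)
{
    if (status < 0)
        throw new StatusException(status, call);
    return status;
}

// Stand-in for a C call that reports failure with a negative status.
extern (C) int fake_close(int handle) nothrow @nogc
{
    return handle >= 0 ? 0 : -1;
}

void main()
{
    import std.exception : assertThrown;

    // Success passes the status straight through.
    assert(throwOnError(fake_close(3), "fake_close") == 0);
    // Failure becomes an exception instead of a status to be checked.
    assertThrown!StatusException(throwOnError(fake_close(-1), "fake_close"));
}
```

Wrapping each call this way keeps the C-style status check in one 
place, at the cost of losing whatever detail the C library keeps 
in its own error stack unless the exception is extended to carry 
it.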

(3) will have to come later.

It is more or less complete, and the examples I have translated 
so far mostly work, but it is still a work in progress.  Any 
help or suggestions appreciated.  [I am doing this for myself, so 
the project is not as pretty as I would like in an ideal world.]


https://github.com/Laeeth/d_hdf5
Dec 21 2014
Rikki Cattermole <alphaglosined gmail.com> writes:
On 22/12/2014 5:51 p.m., Laeeth Isharc wrote:
 https://github.com/Laeeth/d_hdf5

You seem to be missing your dub file. It would be rather hard to get it onto the dub repository without it ;) Oh, and keep the bindings separate from the wrappers by using subpackages.
Dec 21 2014
"Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Monday, 22 December 2014 at 05:04:10 UTC, Rikki Cattermole 
wrote:
 You seem to be missing your dub file. Would be rather hard to 
 get it onto dub repository without it ;)
 Oh and keep the bindings separate from wrappers in terms of 
 subpackages.
Thanks - added now. I will work on separating out the bindings when I have a bit more time, but it should be easy enough.
Dec 22 2014
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 22 December 2014 at 04:51:44 UTC, Laeeth Isharc wrote:
 https://github.com/Laeeth/d_hdf5

Also relevant to some: http://code.dlang.org/packages/netcdf
Dec 22 2014