
digitalmars.D.learn - What is best way to read and interpret binary files?

reply welkam <wwwelkam gmail.com> writes:
So my question is in the subject/title. I want to parse a binary file 
into D structs and can't really find a good way of doing it. 
What I'm trying now is something like this:

byte[4] fake_integer;
auto fd = File("binary.data", "r");
fd.rawRead(fake_integer);
int real_integer = *(cast(int*)  fake_integer.ptr);

What I ideally want is to have some kind of C-style array and 
just cast it into a struct, or take an existing struct and populate 
its fields one by one with data from the file. Is there a D way of doing 
it, or should I call core.stdc.stdio functions instead?
Nov 19 2018
parent reply Neia Neutuladh <neia ikeran.org> writes:
On Mon, 19 Nov 2018 21:30:36 +0000, welkam wrote:
 So my question is in the subject/title. I want to parse a binary file into D
 structs and can't really find a good way of doing it. What I'm trying
 now is something like this:
 
 byte[4] fake_integer;
 auto fd = File("binary.data", "r");
 fd.rawRead(fake_integer);
 int real_integer = *(cast(int*)  fake_integer.ptr);
 
 What I ideally want is to have some kind of C-style array and just cast
 it into a struct, or take an existing struct and populate its fields one by
 one with data from the file. Is there a D way of doing it, or should I call
 core.stdc.stdio functions instead?
Nothing stops you from writing:

    SomeStruct myStruct;
    fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);

Standard caveats about byte order and alignment.
Nov 19 2018
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Nov 19, 2018 at 10:14:25PM +0000, Neia Neutuladh via
Digitalmars-d-learn wrote:
 Nothing stops you from writing:
 
     SomeStruct myStruct;
     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);
Actually, the cast is unnecessary, because arrays implicitly convert to void[], and pointers are sliceable. So all you need is:

	SomeStruct myStruct;
	fd.rawRead((&myStruct)[0 .. 1]);

This works for all POD types.

Writing the struct out to file is the same thing:

	SomeStruct myStruct;
	fd.rawWrite((&myStruct)[0 .. 1]);

with the nice symmetry that you just have to rename rawRead to rawWrite.

For arrays:

	SomeStruct[] arr;
	fd.rawWrite(arr);
	...

	arr.length = ... /* expected length */
	fd.rawRead(arr);

To correctly store length information, you'll have to manually write out the array length as well, and read it back before reading the array. Should be straightforward to figure out.
 Standard caveats about byte order and alignment.
Alignment shouldn't be a problem, since local variables should already be properly aligned. Endianness, however, will be a problem if you intend to transport this data to/from a different platform / hardware. You'll need to manually fix the endianness yourself.


T

-- 
This is not a sentence.
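
To make the endianness point concrete, here is a minimal sketch using std.bitmanip from Phobos. It assumes the file stores its integers as 32-bit little-endian values; that is only an assumption for illustration, and the helper names decodeLE32 and encodeLE32 are made up for this example.

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

/// Decodes a 32-bit little-endian integer from the first four bytes of
/// `data`, whatever the byte order of the host machine is.
uint decodeLE32(const(ubyte)[] data)
{
    // Copy into a fixed-size array; the slice is bounds-checked.
    ubyte[4] raw = data[0 .. 4];
    return littleEndianToNative!uint(raw);
}

/// Encodes `value` as four little-endian bytes, ready for rawWrite.
ubyte[4] encodeLE32(uint value)
{
    return nativeToLittleEndian(value);
}
```

With this approach you would rawRead into a ubyte buffer and decode each field, instead of reading straight into the struct. For big-endian formats, std.bitmanip has the matching bigEndianToNative and nativeToBigEndian.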
Nov 19 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Mon, 19 Nov 2018 14:32:55 -0800, H. S. Teoh wrote:
 Standard caveats about byte order and alignment.
Alignment shouldn't be a problem, since local variables should already be properly aligned.
Right, and the IO layer probably doesn't need to read to aligned memory anyway. Struct fields, however, need to have the same relative alignment as the file.
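
A small sketch of what "same relative alignment as the file" means in practice. The record layout here (a 1-byte tag immediately followed by a 4-byte value) is hypothetical, and the sizes in the static asserts assume a platform where uint has 4-byte alignment, which is the common case.

```d
/// Hypothetical on-disk record with no padding between the fields.
align(1) struct PackedRecord
{
align(1):
    ubyte tag;
    uint  value;
}

/// Same fields with default alignment: the compiler inserts padding
/// after `tag`, so the in-memory layout no longer matches the file.
struct PaddedRecord
{
    ubyte tag;
    uint  value;
}

static assert(PackedRecord.sizeof == 5);
static assert(PackedRecord.value.offsetof == 1);
static assert(PaddedRecord.sizeof == 8);
static assert(PaddedRecord.value.offsetof == 4);
```

Reading a 5-byte file record with rawRead into PaddedRecord would put the value bytes in the wrong place, so checking sizeof and offsetof with static assert is a cheap guard.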
Nov 19 2018
prev sibling parent reply mw <mingwu gmail.com> writes:
On Monday, 19 November 2018 at 22:32:55 UTC, H. S. Teoh wrote:
 Actually, the cast is unnecessary, because arrays implicitly 
 convert to void[], and pointers are sliceable.  So all you need 
 is:

 	SomeStruct myStruct;
 	fd.rawRead((&myStruct)[0 .. 1]);

 This works for all POD types.

 Writing the struct out to file is the same thing:

 	SomeStruct myStruct;
 	fd.rawWrite((&myStruct)[0 .. 1]);
This works, but I'm just wondering why we do not just add more functions to the library: rawRead(ref T t) and rawWrite(ref T t), to read and write a single value.
 For arrays:

 	SomeStruct[] arr;
 	fd.rawWrite(arr);
 	...

 	arr.length = ... /* expected length */
 	fd.rawRead(arr);
Currently, the library only has these two functions, and both take arrays.
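
For the array case, the length has to be stored by hand, as mentioned earlier in the thread. A minimal sketch, assuming a made-up format in which a ulong element count in native byte order precedes the elements. Sample, writeArray and readArray are hypothetical names, not library functions.

```d
import std.exception : enforce;
import std.stdio : File;

/// Hypothetical POD element type.
struct Sample
{
    int   id;
    float weight;
}

/// Writes the element count, then the raw elements.
void writeArray(File f, const(Sample)[] items)
{
    ulong[1] count;
    count[0] = items.length;
    f.rawWrite(count[]);
    if (items.length)
        f.rawWrite(items);
}

/// Reads the element count, then exactly that many elements.
Sample[] readArray(File f)
{
    ulong[1] count;
    enforce(f.rawRead(count[]).length == 1, "truncated length prefix");

    auto items = new Sample[](cast(size_t) count[0]);
    if (items.length == 0)
        return items; // rawRead rejects an empty buffer

    // rawRead returns the part of the buffer it managed to fill.
    enforce(f.rawRead(items).length == items.length, "truncated array data");
    return items;
}
```

A real format would also want to sanity-check the count before allocating, since a corrupt file could ask for an enormous array.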
Mar 29 2021
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 30, 2021 at 12:32:36AM +0000, mw via Digitalmars-d-learn wrote:
This works, but I'm just wondering why we do not just add more functions to the library: rawRead(ref T t) and rawWrite(ref T t), to read and write a single value.
If you wish, submit a PR for this. It's not hard to write your own overloads for it, though:

	void rawWrite(T)(File f, ref T t) @trusted
	{
		f.rawWrite((cast(ubyte*) &t)[0 .. T.sizeof]);
	}

	// ditto for rawRead


T

-- 
A linguistics professor was lecturing to his class one day. "In English," he said, "A double negative forms a positive. In some languages, though, such as Russian, a double negative is still a negative. However, there is no language wherein a double positive can form a negative." A voice from the back of the room piped up, "Yeah, yeah."
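
A slightly fuller sketch of such single-value helpers. The names rawWriteValue and rawReadValue are hypothetical and are chosen to differ from the File members on purpose: as far as I know, the compiler does not fall back to a free function via UFCS when a member with the same name already exists, so distinct names keep the fd.rawReadValue(x) call syntax available. The hasIndirections constraint is an extra safety check of mine, not something from the thread.

```d
import std.exception : enforce;
import std.stdio : File;
import std.traits : hasIndirections;

/// Hypothetical POD header used only to exercise the helpers.
struct Header
{
    uint   magic;
    ushort ver;
}

/// Writes the raw bytes of one value. Types containing pointers or
/// slices are rejected, since dumping them raw is rarely meaningful.
void rawWriteValue(T)(File f, ref T value) @trusted
if (!hasIndirections!T)
{
    f.rawWrite((&value)[0 .. 1]);
}

/// Reads the raw bytes of one value and throws on a short read.
void rawReadValue(T)(File f, ref T value) @trusted
if (!hasIndirections!T)
{
    enforce(f.rawRead((&value)[0 .. 1]).length == 1,
            "unexpected end of file");
}
```

Usage would look like `Header h; fd.rawReadValue(h);`, with the same byte-order and padding caveats as above.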
Mar 29 2021
prev sibling parent reply welkam <wwwelkam gmail.com> writes:
On Monday, 19 November 2018 at 22:14:25 UTC, Neia Neutuladh wrote:
 Nothing stops you from writing:

     SomeStruct myStruct;
     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);

 Standard caveats about byte order and alignment.
I would never have thought of casting a struct to a static array. If I understood correctly, you cast the myStruct pointer to a ubyte pointer and then construct a static array on the stack with tmpArray.ptr = (ubyte pointer) and tmpArray.sizeof = SomeStruct.sizeof.

What I figured out when I woke up is that I never needed C-style arrays. What I could do is allocate enough memory for the whole file in a ubyte array, use slices to read the data in chunks, and cast them into the necessary structs.

Thanks, Neia Neutuladh and H. S. Teoh, for giving me some pointers.
https://www.explainxkcd.com/wiki/index.php/138:_Pointers
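
The whole-file approach could be sketched like this. The ChunkHeader layout and the takeStruct helper are hypothetical. The helper copies the bytes into a fresh struct instead of casting the slice pointer, which sidesteps misaligned access when a record starts at an odd offset in the buffer.

```d
import std.exception : enforce;

/// Hypothetical fixed-size record at the start of the file.
struct ChunkHeader
{
    uint magic;
    uint count;
}

/// Copies one T out of the front of `bytes` and advances the slice
/// past it, so consecutive calls walk through the buffer.
T takeStruct(T)(ref const(ubyte)[] bytes)
{
    enforce(bytes.length >= T.sizeof, "buffer too short");
    T result;
    (cast(ubyte*) &result)[0 .. T.sizeof] = bytes[0 .. T.sizeof];
    bytes = bytes[T.sizeof .. $];
    return result;
}
```

The buffer itself can come from `cast(const(ubyte)[]) std.file.read("binary.data")`, after which `auto h = takeStruct!ChunkHeader(bytes);` consumes the first record.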
Nov 20 2018
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Tuesday, 20 November 2018 at 11:54:59 UTC, welkam wrote:
I would never have thought of casting a struct to a static array. If I understood correctly, you cast the myStruct pointer to a ubyte pointer and then construct a static array on the stack with tmpArray.ptr = (ubyte pointer) and tmpArray.sizeof = SomeStruct.sizeof.
Almost correct, except it's not a static array, it's just a slice, i.e. ubyte[].
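
A short sketch of the difference: slicing a pointer gives a ubyte[] that is just a pointer plus a length viewing the struct's own memory, and nothing is copied. Pair and bytesOf are hypothetical names for illustration.

```d
/// Hypothetical two-field struct.
struct Pair
{
    int a;
    int b;
}

/// Returns a slice over the bytes of `value`. The slice aliases the
/// caller's variable and must not outlive it.
ubyte[] bytesOf(T)(return ref T value) @trusted
{
    return (cast(ubyte*) &value)[0 .. T.sizeof];
}
```

Writing through the slice, for example `bytesOf(p)[] = 0;`, changes p itself, which is exactly why rawRead can fill a struct this way.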
Nov 20 2018
parent welkam <wwwelkam gmail.com> writes:
On Tuesday, 20 November 2018 at 12:01:49 UTC, Stanislav Blinov 
wrote:
Almost correct, except it's not a static array, it's just a slice, i.e. ubyte[].
I guess it came from interoperability with C, where you want to slice C arrays? That's useful to know.
Nov 20 2018