digitalmars.D.learn - How to read a C++ class from file into memory

David Finlayson <david.p.finlayson gmail.com> writes:
I am coming from Python to D, so forgive my limited C/C++ knowledge.

What is the idiomatic way to read a heterogeneous binary structure in D? 

In my C++ book, it shows examples of defining a class or struct with the
appropriate types and then passing a pointer to this class to fread().

However, in Java or Python I could just read the types directly from a binary
stream (including the padding bytes associated with the structure on disk).

How should I do this in D?

I did see this post:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=6071

Note that I ultimately want to store these data back into classes where I can
work with it.

Thanks,

David
Mar 22 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
David Finlayson wrote:
> I am coming from Python to D, so forgive my limited C/C++ knowledge.

Don't worry; I'd be inclined to think it's a good thing :3

> What is the idiomatic way to read a heterogeneous binary structure in D?
>
> In my C++ book, it shows examples of defining a class or struct with the
> appropriate types and then passing a pointer to this class to fread().

For my money, that's a bad idea because the binary representation of a
struct or object isn't necessarily the same on different machines or
even same machine, different operating system. Quick example: real is a
different size on Windows to Linux (IIRC).

> However, in Java or Python I could just read the types directly from a binary
> stream (including the padding bytes associated with the structure on disk).
>
> How should I do this in D?
>
> I did see this post:
>
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=6071
>
> Note that I ultimately want to store these data back into classes where I can
> work with it.
>
> Thanks,
>
> David

The nice thing about Python is the pickle protocol. That's what I
assume you were using. Since Python can interactively inspect objects
to find out what data is attached to them, this is really easy. It's
also nice because pickle isn't blind: it will serialise things in a
predictable format based on type, not on in-memory layout.

So you can dump a bunch of Python objects to a file, send it to another
machine, and read them back out again.

In D, we're kinda-sorta there. The way I'm solving this is using the
.tupleof property of structures. For example:

    struct Point
    {
        double x, y;
    }

    {
        Point pt;
        foreach( member ; pt.tupleof )
            member = 0.0;
    }

At which point, pt.x = pt.y = 0. Combining this with templated
functions lets you write out or read in any structure you please.

Note: I haven't tried *any* of this with classes, because my feeling is
that classes are often much more complex than structures (which are just
plain old data, clumped together), plus they're far more likely to have
references to other stuff; then what do you do?

I thought about including some code from my serialisation library, but
it tends to be "all or nothing". Plus, this was written for a research
project, and I'm not sure who *actually* owns the code (me or the uni) X_X.

Anyway, hope this helps :)

-- Daniel

--
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

http://xkcd.com/

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP    http://hackerkey.com/
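The idea of combining .tupleof with a templated function might look roughly like the sketch below. It is only an illustration: the helper name readFields is made up, it reads from an in-memory byte slice rather than a stream, and it assumes the data is in the machine's native byte order with no padding between fields. It addresses each member as s.tupleof[i] so that the write goes to the struct itself and not to a copy.

```d
import std.traits : isScalarType;

struct Point
{
    double x, y;
}

/// Hypothetical helper (not from any library): fills every field of `s`
/// from `data` in declaration order and returns the unread remainder.
/// Assumes native byte order and no padding bytes in `data`.
const(ubyte)[] readFields(S)(ref S s, const(ubyte)[] data)
    if (is(S == struct))
{
    foreach (i, Field; typeof(s.tupleof))
    {
        static assert(isScalarType!Field, "only scalar fields are handled");
        // Copy Field.sizeof raw bytes over the i-th member of s.
        (cast(ubyte*) &s.tupleof[i])[0 .. Field.sizeof] = data[0 .. Field.sizeof];
        data = data[Field.sizeof .. $];
    }
    return data;
}

void main()
{
    // Stand-in for bytes read from a file: two doubles back to back.
    double[2] raw = [1.5, -2.0];
    auto bytes = cast(const(ubyte)[]) raw[];

    Point pt;
    auto rest = readFields(pt, bytes);
    assert(pt.x == 1.5 && pt.y == -2.0);
    assert(rest.length == 0);
}
```

Reading field by field like this sidesteps the in-memory padding question, at the cost of having to handle byte order and nested types yourself.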
Mar 22 2007
David Finlayson <david.p.finlayson gmail.com> writes:
Daniel Keep Wrote:

> For my money, that's a bad idea because the binary representation of a
> struct or object isn't necessarily the same on different machines or
> even same machine, different operating system.

Agreed; however, the file is a binary storage format for a sonar system.
I have been given enough code snippets from the company to read the file
(and I have a working Python version). What I want to do now is convert
this code to D.

Ultimately, it is my problem to conform to their file format. I don't
even know the byte alignment. However, I do have a working Python
prototype. (I guess if I thought about it, I could figure the alignment
out now).

> The nice thing about Python is the pickle protocol.  That's what I
> assume you were using.  Since Python can interactively inspect objects
> to find out what data is attached to them, this is really easy.  It's
> also nice because pickle isn't blind: it will serialise things in a
> predictable format based on type, not on in-memory layout.

In my case, I used the unpack function from Python's struct module to
build a reader for each of the classes, structs and unions (yes, they
used all three types). It took me a while, but I was able to identify
where the padding bytes were placed to fill out the structures on disk.

So, I understand exactly how the file is stored on disk. All I need to
do is learn how to read the file efficiently in D.

Thanks for answering my post. Do you know how I might use std.stream to
read these files?

David
Mar 22 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"David Finlayson" <david.p.finlayson gmail.com> wrote in message
news:etvlr7$1a9p$1 digitalmars.com...

> Thanks for answering my post. Do you know how I might use std.stream to
> read these files?

If you know exactly how the data is structured, you may be able to
define several structures which define the layout of the data, e.g.

    align(1) struct Header
    {
        uint magic;
        uint version_;      // "version" itself is a keyword in D
        char[100] comments;
    }

Or something along those lines, and then read it in with readExact:

    Stream s = ...
    Header h;
    s.readExact(&h, Header.sizeof);

If the format is more complex, it'll probably take a bit more work, but
that's the general idea.
Mar 22 2007
David Finlayson <david.p.finlayson gmail.com> writes:
Jarrett Billingsley Wrote:

> "David Finlayson" <david.p.finlayson gmail.com> wrote in message
> news:etvlr7$1a9p$1 digitalmars.com...
>
>> Thanks for answering my post. Do you know how I might use std.stream to
>> read these files?
>
> If you know exactly how the data is structured, you may be able to
> define several structures which define the layout of the data, e.g.
>
>     align(1) struct Header
>     {
>         uint magic;
>         uint version_;      // "version" itself is a keyword in D
>         char[100] comments;
>     }
>
> Or something along those lines, and then read it in with readExact:
>
>     Stream s = ...
>     Header h;
>     s.readExact(&h, Header.sizeof);
>
> If the format is more complex, it'll probably take a bit more work, but
> that's the general idea.

OK, this is what I want.

Question: Header h creates a structure of type Header. Is h a pointer?
It looks like you dereferenced it with &h in readExact(). I really
don't understand how D uses pointers yet.
Mar 22 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
David Finlayson wrote:
> Jarrett Billingsley Wrote:
>
>> "David Finlayson" <david.p.finlayson gmail.com> wrote in message
>> news:etvlr7$1a9p$1 digitalmars.com...
>>
>>> Thanks for answering my post. Do you know how I might use std.stream to
>>> read these files?
>>
>> If you know exactly how the data is structured, you may be able to
>> define several structures which define the layout of the data, e.g.
>>
>>     align(1) struct Header
>>     {
>>         uint magic;
>>         uint version_;      // "version" itself is a keyword in D
>>         char[100] comments;
>>     }
>>
>> Or something along those lines, and then read it in with readExact:
>>
>>     Stream s = ...
>>     Header h;
>>     s.readExact(&h, Header.sizeof);
>>
>> If the format is more complex, it'll probably take a bit more work, but
>> that's the general idea.
>
> OK, this is what I want.
>
> Question: Header h creates a structure of type Header. Is h a pointer?
> It looks like you dereferenced it with &h in readExact(). I really
> don't understand how D uses pointers yet.

No, structs in D are POD: Plain Old Data. &h is taking the address of h.

What readExact does is it takes a pointer, and a length, and reads
exactly that many bytes, and puts them at that pointer. &h works out
*where* h is being stored (so readExact can write to it), and
Header.sizeof tells it how many bytes to read.

-- Daniel
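For readers on a current compiler: as far as I know, std.stream is no longer shipped with Phobos, so the sketch below swaps in std.stdio.File.rawRead to show the same "address plus size" idea. The helper name readStruct and the file name header.bin are made up for the example, and the field is called ver because version is a reserved word in D.

```d
import std.stdio : File;
import std.file : write, remove;

struct Header
{
    uint magic;
    uint ver;            // "version" is a reserved word in D
    char[100] comments;
}

/// Hypothetical helper: reads one T from the current position of `f`.
T readStruct(T)(ref File f)
{
    T value;
    // (&value)[0 .. 1] views the single struct as a one-element slice,
    // so rawRead fills T.sizeof bytes starting at the address of value.
    auto got = f.rawRead((&value)[0 .. 1]);
    if (got.length != 1)
        throw new Exception("unexpected end of file");
    return value;
}

void main()
{
    // Write a sample header first so the example is self-contained.
    Header original;
    original.magic = 0xCAFEBABE;
    original.ver = 2;
    original.comments[] = ' ';
    write("header.bin", (cast(const(ubyte)*) &original)[0 .. Header.sizeof]);
    scope (exit) remove("header.bin");

    auto f = File("header.bin", "rb");
    auto h = readStruct!Header(f);
    assert(h.magic == 0xCAFEBABE && h.ver == 2);
}
```

As with readExact, nothing here converts byte order or checks the layout; the struct definition has to match the bytes on disk.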
Mar 22 2007
David Finlayson <david.p.finlayson gmail.com> writes:
Thanks Dan and Jarrett -

Using readExact() to load a struct worked perfectly.
Mar 23 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"David Finlayson" <david.p.finlayson gmail.com> wrote in message
news:etuh4g$2ijp$1 digitalmars.com...
> I am coming from Python to D, so forgive my limited C/C++ knowledge.
>
> What is the idiomatic way to read a heterogeneous binary structure in D?
>
> In my C++ book, it shows examples of defining a class or struct with the
> appropriate types and then passing a pointer to this class to fread().
>
> However, in Java or Python I could just read the types directly from a
> binary stream (including the padding bytes associated with the structure
> on disk).
>
> How should I do this in D?
>
> I did see this post:
>
> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=6071
>
> Note that I ultimately want to store these data back into classes where I
> can work with it.

If you haven't got too many classes/structures to serialize, I've
attached a module with a simple serialization/deserialization mechanism
that works with std.stream. It will automatically serialize out all
primitive and array types, as well as structures which have no unions.
You can specify custom serialization and deserialization methods for
classes and structures, and you can make structures behave as though
they were an opaque chunk of data (for performance when reading/writing).

It's very easy to use; to serialize anything, you just write:

    Serialize(stream, data);

And to deserialize it again:

    Deserialize(stream, data);

Defining the custom methods for classes and structs is easy. The
serialize function should just be declared as "void serialize(Stream s)"
and the deserialize function as "static T deserialize(Stream s)", where
T is the type for which you're defining the deserialize function.

I extracted this code from a larger module, and I think it has
everything it needs to work; if it doesn't, let me know!

(Attachment: cereal.d)
Mar 22 2007
David Finlayson <david.p.finlayson gmail.com> writes:
For the moment, I just want to understand the key part of your Deserialize
function.

The secret sauce is in this line (and others like it): 

s.readExact(strptr, char.sizeof * len)

where s is a file stream. It looks like with this method, it is only possible
to read a single variable or an array of the same type (as you are doing here).
Is it possible to send a pointer to a struct and read THAT in from the stream?
If so, how does it handle padding bytes? I know there is an align() attribute
for structs that might apply here.
Mar 22 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"David Finlayson" <david.p.finlayson gmail.com> wrote in message
news:etvmlg$1auh$1 digitalmars.com...
> For the moment, I just want to understand the key part of your Deserialize
> function.
>
> The secret sauce is in this line (and others like it):
>
> s.readExact(strptr, char.sizeof * len)
>
> where s is a file stream. It looks like with this method, it is only
> possible to read a single variable or an array of the same type (as you
> are doing here). Is it possible to send a pointer to a struct and read
> THAT in from the stream? If so, how does it handle padding bytes? I know
> there is an align() attribute for structs that might apply here.

Yes, as you've seen in the other post, you can do that. As for
alignment issues, that's up to your structure to know the layout of
your data. Say you know the format is something like:

    Header structure:
    0x0000   4 bytes: Magic number
    0x0004   2 bytes: Version
    0x0006   1 byte:  Flags
    0x0007   1 byte:  (padding)
    0x0008  24 bytes: Comments
    0x0020   4 bytes: offset in file to some table
    0x0024   4 bytes: (reserved)
    --------------------------
    Total:  40 bytes

So you could write your structure like this. Notice we use "align(1)"
on the structure to signal to the compiler that the members should be
packed in as tightly as possible. This way we have complete control
over how the data is laid out.

    align(1) struct Header
    {
        uint magic;
        ushort version_;    // "version" itself is a keyword in D
        ubyte flags;
        ubyte _padding1;
        char[24] comments;
        uint tableOffset;
        uint _reserved;
    }

    // For good measure
    static assert(Header.sizeof == 40);

That static assert is there to make sure that the structure's size
matches the calculated size of the header, and to make sure that we
don't inadvertently change the header struct and mess things up. Of
course, if the header is a variable length, that assert probably
wouldn't be there.
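The size check can be extended to every field offset, which catches a misplaced padding byte that happens to leave the total size unchanged. This is only a sketch under two assumptions: the field is named ver because version is a reserved word, and member packing is requested with "align(1):" inside the struct body, which is how I understand current D compilers expect it (an align on the struct declaration itself sets the alignment of the whole struct).

```d
struct Header
{
align(1):                // pack the members that follow with no gaps
    uint magic;
    ushort ver;          // "version" is a reserved word in D
    ubyte flags;
    ubyte _padding1;
    char[24] comments;
    uint tableOffset;
    uint _reserved;
}

// Each offset mirrors one row of the on-disk layout table.
static assert(Header.magic.offsetof       == 0x00);
static assert(Header.ver.offsetof         == 0x04);
static assert(Header.flags.offsetof       == 0x06);
static assert(Header._padding1.offsetof   == 0x07);
static assert(Header.comments.offsetof    == 0x08);
static assert(Header.tableOffset.offsetof == 0x20);
static assert(Header._reserved.offsetof   == 0x24);
static assert(Header.sizeof               == 40);

void main()
{
    // Nothing to do at run time: all of the checks above happen
    // while the program is being compiled.
}
```

Because the checks are static, a layout mistake stops the build instead of silently misreading the file.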
Mar 23 2007