digitalmars.D.learn - Binary I/O for Newbie
- "tjb" <broughtj gmail.com> Feb 27 2012
- "Robert Jacques" <sandford jhu.edu> Feb 27 2012
- Justin Whear <justin economicmodeling.com> Feb 27 2012
- Ali Çehreli <acehreli yahoo.com> Feb 27 2012
- Tobias Brandt <tob.brandt googlemail.com> Feb 27 2012
- Justin Whear <justin economicmodeling.com> Feb 27 2012
- "tjb" <broughtj gmail.com> Feb 27 2012
- Tobias Brandt <tob.brandt googlemail.com> Feb 27 2012
- Tobias Brandt <tob.brandt googlemail.com> Feb 27 2012
- "tjb" <broughtj gmail.com> Feb 27 2012
- Tobias Brandt <tob.brandt googlemail.com> Feb 27 2012
All,
I am just starting to learn D. I am an economist - not a
programmer, so I appreciate your patience with my lack of knowledge.
I have some financial data in a binary file that I would like to
process. In C++ I have the data in a structure like this:
    struct TaqIdx {
        char symbol[10];
        int tdate;
        int begrec;
        int endrec;
    };
And I use an ifstream to read the raw bytes and cast them to the structure.
I'm struggling to get a handle on I/O in D. Can you give some
pointers? Thanks so much!
TJB
Feb 27 2012
On Mon, 27 Feb 2012 12:21:21 -0600, tjb <broughtj gmail.com> wrote:
> [...] I'm struggling to get a handle on I/O in D. Can you give some pointers?

This is about the simplest way to read binary data in:

    auto data = cast(TaqIdx[]) std.file.read(filename);
Feb 27 2012
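As a rough illustration of the one-line cast above, here is a minimal sketch that writes two made-up records to a scratch file and reads them back with std.file.read. It assumes the packed 22-byte layout that is confirmed later in the thread; the helper makeRecord, the scratch file name and the sample values are hypothetical and only exist to keep the example self-contained.

```d
import std.file : read, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : writefln;

// Assumed packed on-disk layout: 10 + 4 + 4 + 4 = 22 bytes per record.
align(1) struct TaqIdx
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

static assert(TaqIdx.sizeof == 22);

// Hypothetical helper that builds a space-padded record for the demo file.
TaqIdx makeRecord(string sym, int date, int beg, int end)
{
    TaqIdx r;
    r.symbol[] = ' ';
    r.symbol[0 .. sym.length] = sym; // assumes sym.length <= 10
    r.tdate = date;
    r.begrec = beg;
    r.endrec = end;
    return r;
}

void main()
{
    // Made-up sample data in a scratch file, so the sketch runs on its own.
    auto path = buildPath(tempDir(), "taqidx_demo.idx");
    write(path, [makeRecord("IBM", 20080801, 1, 100),
                 makeRecord("MSFT", 20080801, 101, 250)]);
    scope (exit) remove(path);

    void[] raw = read(path);

    // The array cast fails at run time if the byte count is not a whole
    // number of records, so check explicitly for a clearer message.
    if (raw.length % TaqIdx.sizeof != 0)
    {
        writefln("%s bytes is not a multiple of %s", raw.length, TaqIdx.sizeof);
        return;
    }

    auto data = cast(TaqIdx[]) raw;
    foreach (ref rec; data)
        writefln("%s %s %s %s", rec.symbol, rec.tdate, rec.begrec, rec.endrec);
}
```

Reading the whole file at once is convenient for small index files; for very large files, reading record by record may be the better fit.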
On Mon, 27 Feb 2012 19:21:21 +0100, tjb wrote:
> [...] I'm struggling to get a handle on I/O in D. Can you give some pointers?

Check out std.stream (http://dlang.org/phobos/std_stream.html). I'd do something like this:

    auto input = new File("somefile");
    TaqIdx tag;
    input.readExact(&tag, TaqIdx.sizeof);

If you get funky results, the file might be using a different endianness.
Feb 27 2012
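To make the endianness remark concrete, the following sketch shows one way the integer fields could be repaired if the file came from a machine with the opposite byte order. It uses swapEndian from std.bitmanip; the helper withSwappedInts and the sample date are hypothetical, and the packed layout is an assumption.

```d
import std.bitmanip : swapEndian;
import std.stdio : writefln;

// Assumed packed layout, 22 bytes per record.
align(1) struct TaqIdx
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

// Hypothetical helper: reverse the byte order of every integer field.
// The symbol bytes are left alone because single bytes have no byte order.
TaqIdx withSwappedInts(TaqIdx rec)
{
    rec.tdate = swapEndian(rec.tdate);
    rec.begrec = swapEndian(rec.begrec);
    rec.endrec = swapEndian(rec.endrec);
    return rec;
}

void main()
{
    TaqIdx rec;
    rec.tdate = 20080801; // made-up sample date

    // Simulate a record whose integers arrive in the opposite byte order,
    // then repair it by swapping a second time.
    auto foreign = withSwappedInts(rec);
    auto repaired = withSwappedInts(foreign);

    writefln("as read: %s, after swapping: %s", foreign.tdate, repaired.tdate);
    assert(repaired.tdate == 20080801);
}
```

A quick sanity check is whether tdate looks like a plausible date after reading; if it only does so after swapping, the file was most likely written with the other byte order.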
On 02/27/2012 11:27 AM, Tobias Brandt wrote:
>> So, something like this should work: [...]
>
> It really depends on how you wrote the file originally. If you know that it is packed, i.e. 10+32+32+32=106 bytes per record, then yes.

You meant 4 bytes per int. :)

> If you wrote to the file with a C++ program, then I guess the compiler aligned the data so that the whole struct is 128 bytes in size. Technically, the C++ compiler is allowed to do anything short of changing the order of the struct fields.

That is correct for non-POD types. The C++ compiler must treat POD structs essentially as if they are C structs.

Ali
Feb 27 2012
On 02/27/2012 11:43 AM, Tobias Brandt wrote:
>>> If you wrote to the file with a C++ program, then I guess the compiler aligned the data so that the whole struct is 128 bytes in size. Technically, the C++ compiler is allowed to do anything short of changing the order of the struct fields.
>>
>> That is correct for non-POD types. The C++ compiler must treat POD structs essentially as if they are C structs.
>
> Correct me if I'm wrong. But as far as I know the C standard also allows arbitrary alignment.

You were correct. I somehow misread "short of changing the order" as meaning "even changing the order". But even then I wasn't entirely correct. Just found this thread: http://stackoverflow.com/q/281045

C guarantees that the members are not reordered, but C++ allows reordering across access specifiers: "The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1)."

Ali
Feb 27 2012
On 02/27/2012 10:21 AM, tjb wrote:
> I have some financial data in a binary file that I would like to process. In C++ I have the data in a structure like this:
>
>     struct TaqIdx {
>         char symbol[10];
>         int tdate;
>         int begrec;
>         int endrec;
>     }

The equivalent of that C++ (and C) struct would be almost the same in D. Just replace char with 'ubyte' (or 'byte', if the 'char' were signed). Let me stress the fact that character types of D are UTF code units, not bytes.

    struct TaqIdx {
        ubyte[10] symbol;
        int tdate;
        int begrec;
        int endrec;
    }

On the other hand, if symbol were indeed in ASCII, and will be treated as such in the program, then char[10] is fine too:

    char[10] symbol;

Also, the int type is always 32 bits in D. Check whether that was the case for the system that the C++ TaqIdx was used on.

You can read in binary by std.stdio.File.rawRead. It is mildly annoying that you still have to use an array even when reading a single TaqIdx:

    TaqIdx[1] taqs;
    file.rawRead(taqs);

Then use taqs[0] if there is only one.

> Can you give some pointers?

Must... resist... :p

Ali
Feb 27 2012
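The rawRead approach described above can be sketched as a loop that reads one record at a time. std.stream, suggested elsewhere in this thread, has since been removed from Phobos as far as I know, so this sketch sticks to std.stdio.File. The packed layout, the default file name and the helper trimPadding are assumptions; in particular, the thread does not say whether symbols are padded with spaces or NUL bytes, so the helper strips both.

```d
import std.file : exists;
import std.stdio : File, writefln;

// Assumed packed layout, 22 bytes per record.
align(1) struct TaqIdx
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

// Hypothetical helper: drop trailing spaces and NUL bytes from a symbol.
// Assumes the symbols are plain ASCII.
const(char)[] trimPadding(const(char)[] s)
{
    while (s.length && (s[$ - 1] == ' ' || s[$ - 1] == '\0'))
        s = s[0 .. $ - 1];
    return s;
}

void main(string[] args)
{
    // Default file name taken from the thread; pass another path as argument.
    auto path = args.length > 1 ? args[1] : "T200808A.IDX";
    if (!path.exists)
    {
        writefln("no such file: %s", path);
        return;
    }

    auto file = File(path, "rb");
    TaqIdx[1] buf;

    // rawRead returns the part of the buffer it filled; an empty slice
    // means there was no complete record left to read.
    while (file.rawRead(buf[]).length == 1)
    {
        writefln("%s %s %s %s", trimPadding(buf[0].symbol[]),
                 buf[0].tdate, buf[0].begrec, buf[0].endrec);
    }
}
```

A larger buffer, for example TaqIdx[1024], would cut down on the number of read calls; the loop would then iterate over the returned slice instead of testing for a length of 1.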
Doesn't the struct alignment play a role here?
Feb 27 2012
On Mon, 27 Feb 2012 19:42:36 +0100, Tobias Brandt wrote:
> Doesn't the struct alignment play a role here?
Good point. If the data is packed, you can toss an align(1) on the front of the struct declaration.
Feb 27 2012
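One caveat for readers using a current compiler: as far as I can tell from the language specification, align(1) in front of a struct declaration now only sets the alignment of the struct as a whole, while the fields keep their default alignment unless align(1): is also applied inside the body. The sketch below checks the three variants at compile time; the struct names Natural, OuterOnly and Packed are made up for the comparison.

```d
import std.stdio : writefln;

// No alignment attribute: each int is placed on a 4-byte boundary,
// so 2 padding bytes follow the 10-byte symbol.
struct Natural
{
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

// align(1) on the declaration only: the struct itself may sit at any
// address, but the fields are still laid out with default alignment.
align(1) struct OuterOnly
{
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

// align(1) outside and inside: no padding anywhere, which should match
// a C++ struct declared with __attribute__((packed)).
align(1) struct Packed
{
align(1):
    char[10] symbol;
    int tdate;
    int begrec;
    int endrec;
}

static assert(Natural.tdate.offsetof == 12 && Natural.sizeof == 24);
static assert(OuterOnly.tdate.offsetof == 12 && OuterOnly.sizeof == 24);
static assert(Packed.tdate.offsetof == 10 && Packed.sizeof == 22);

void main()
{
    writefln("Natural:   %s bytes", Natural.sizeof);
    writefln("OuterOnly: %s bytes", OuterOnly.sizeof);
    writefln("Packed:    %s bytes", Packed.sizeof);
}
```

Keeping a static assert on the record size next to the struct definition is a cheap way to notice if the layout ever stops matching the file format.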
On Monday, 27 February 2012 at 18:56:15 UTC, Justin Whear wrote:
> On Mon, 27 Feb 2012 19:42:36 +0100, Tobias Brandt wrote:
>> Doesn't the struct alignment play a role here?
>
> Good point. If the data is packed, you can toss an align(1) on the front of the struct declaration.

So, something like this should work:

    import std.stdio : writeln, writefln;
    import std.stream;

    align(1) struct TaqIdx {
        char[10] symbol;
        int tdate;
        int begrec;
        int endrec;
    }

    void main() {
        auto input = new File("T200808A.IDX");
        TaqIdx taq;
        input.readExact(&taq, TaqIdx.sizeof);
        writefln("%s %s %s %s", taq.symbol, taq.tdate, taq.begrec, taq.endrec);
    }

Thanks so much!

TJB
Feb 27 2012
> So, something like this should work: [...]
It really depends on how you wrote the file originally. If you know that it is packed, i.e. 10+32+32+32=106 bytes per record, then yes. If you wrote to the file with a C++ program, then I guess the compiler aligned the data so that the whole struct is 128 bytes in size. Technically, the C++ compiler is allowed to do anything short of changing the order of the struct fields. You could just let your C++ program print out sizeof(TaqIdx) or manually divide the file size by the number of records (if you know it) to make sure.
Feb 27 2012
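The file-size check suggested above can be sketched as follows. With 4-byte ints, the two plausible record sizes are 22 bytes (packed: 10 + 4 + 4 + 4) and 24 bytes (naturally aligned, with 2 padding bytes after the symbol). The helper fitsRecordSize and the default file name are assumptions for illustration.

```d
import std.file : exists, getSize;
import std.stdio : writefln;

// Hypothetical helper: does byteCount hold a whole number of records?
bool fitsRecordSize(ulong byteCount, size_t recordSize)
{
    return byteCount % recordSize == 0;
}

void main(string[] args)
{
    // Default file name taken from the thread; pass another path as argument.
    auto path = args.length > 1 ? args[1] : "T200808A.IDX";
    if (!path.exists)
    {
        writefln("no such file: %s", path);
        return;
    }

    immutable size = getSize(path);

    // 22 = packed layout, 24 = naturally aligned layout.
    foreach (recordSize; [22, 24])
    {
        writefln("%s bytes per record: %s records, %s bytes left over, fits: %s",
                 recordSize, size / recordSize, size % recordSize,
                 fitsRecordSize(size, recordSize));
    }
}
```

A file size that happens to be a multiple of both 22 and 24 (any multiple of 264) leaves the question open, so printing sizeof(TaqIdx) from the original C++ program remains the more reliable check.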
>> It really depends on how you wrote the file originally. If you know that it is packed, i.e. 10+32+32+32=106 bytes per record, then yes.
>
> You meant 4 bytes per int. :)

Yep, good catch.

>> If you wrote to the file with a C++ program, then I guess the compiler aligned the data so that the whole struct is 128 bytes in size. Technically, the C++ compiler is allowed to do anything short of changing the order of the struct fields.
>
> That is correct for non-POD types. The C++ compiler must treat POD structs essentially as if they are C structs.

Correct me if I'm wrong. But as far as I know the C standard also allows arbitrary alignment.
Feb 27 2012
On Monday, 27 February 2012 at 19:28:07 UTC, Tobias Brandt wrote:
> [...] You could just let your C++ program print out sizeof(TaqIdx) or manually divide the file size by the number of records (if you know it) to make sure.

Just looked at my old C++ code. And the struct looks like this:

    struct TaqIdx {
        char symbol[10];
        int tdate;
        int begrec;
        int endrec;
    } __attribute__((packed));

So I am guessing I want to use the align(1) as Justin suggested. Correct?

TJB
Feb 27 2012
> Just looked at my old C++ code. And the struct looks like this: struct TaqIdx { char symbol[10]; int tdate; int begrec; int endrec; }__attribute__((packed)); So I am guessing I want to use the align(1) as Justin suggested. Correct?
Yes.
Feb 27 2012