digitalmars.D.learn - Read file/stream

reply nrgyzer <nrgyzer gmail.com> writes:
I'm trying to read a png file and I'm having some trouble with the
chunk-size. Each chunk of a png file begins with a 4 byte (unsigned)
integer. When I read this 4 byte integer (uint) I get an absolutely
incorrect length. My code currently looks like:

import std.stream; // File and FileMode come from std.stream

void main(string[] args) {

   File f = new File("test.png", FileMode.In);

   // png signature
   ubyte[8] buffer;
   f.read(buffer);

   // first chunk (IHDR)
   uint size;
   f.read(size);

   f.close();
}

When I run my code, I get 218103808 instead of 13 (decimal) or 0x0D
(hex). When I read the 4-byte integer as a ubyte[4] array instead, I
get [0, 0, 0, 13], which seems to be correct because my
hex editor shows [0x00 0x00 0x00 0x0D] for these 4 bytes.

I hope someone knows where my mistake is. Thanks!
Mar 11 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 11 Mar 2011 13:43:19 -0500, nrgyzer <nrgyzer gmail.com> wrote:

 I'm trying to read a png file and I'm having some trouble with the
 chunk-size. Each chunk of a png file begins with a 4 byte (unsigned)
 integer. When I read this 4 byte integer (uint) I get an absolutely
 incorrect length. My code currently looks like:

 import std.stream; // File and FileMode come from std.stream

 void main(string[] args) {

    File f = new File("test.png", FileMode.In);

    // png signature
    ubyte[8] buffer;
    f.read(buffer);

    // first chunk (IHDR)
    uint size;
    f.read(size);

    f.close();
 }

 When I run my code, I get 218103808 instead of 13 (decimal) or 0x0D
 (hex). When I read the 4-byte integer as a ubyte[4] array instead, I
 get [0, 0, 0, 13], which seems to be correct because my
 hex editor shows [0x00 0x00 0x00 0x0D] for these 4 bytes.

 I hope someone knows where my mistake is. Thanks!

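http://en.wikipedia.org/wiki/Endianness

Intel boxes are little-endian, which means a native uint read only yields 13 if the 4 bytes in the file are [13, 0, 0, 0]. Your file holds [0, 0, 0, 13], i.e. the value is stored big-endian, so the bytes have to be swapped after reading.

I am not sure what facilities Phobos provides for reading/writing integers in network order (i.e. big-endian), but I'm sure there's something.

-Steve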
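
For readers on a current Phobos, the point above can be sketched with std.bitmanip, which did not come up in this thread. This is a minimal sketch rather than the approach anyone here used; the helper name chunkLength is made up for illustration, and the four bytes are the ones reported in the original post.

```d
import std.bitmanip : bigEndianToNative, littleEndianToNative;
import std.stdio : writeln;

// Hypothetical helper: decode a PNG chunk length from its four raw bytes.
// PNG stores the length big-endian, so the conversion is explicit.
uint chunkLength(ubyte[4] raw)
{
    return bigEndianToNative!uint(raw);
}

void main()
{
    ubyte[4] raw = [0, 0, 0, 13]; // the bytes shown by the hex editor

    writeln(chunkLength(raw));               // 13
    writeln(littleEndianToNative!uint(raw)); // 218103808, the "wrong" value
}
```

Reading the bytes into a ubyte[4] first and converting afterwards keeps the result independent of the host's byte order.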
Mar 11 2011
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 11/03/2011 18:46, Steven Schveighoffer wrote:
<snip>
 I am not sure what facilities Phobos provides for reading/writing integers in
network
 order (i.e. Big Endian), but I'm sure there's something.

http://www.digitalmars.com/d/1.0/phobos/std_stream.html
EndianStream

I haven't experimented with it. And I don't expect it to handle structs well.

Alternatively, you could use some simple code like
--------
version (BigEndian) {
     uint bigEndian(uint value) {
         return value;
     }
}
version (LittleEndian) {
     uint bigEndian(uint value) {
         return value << 24
           | (value & 0x0000FF00) << 8
           | (value & 0x00FF0000) >> 8
           | value >> 24;
     }
}
--------
though you would have to remember to call it for each file I/O operation that relies on it. If you use a struct, you could put a method in it to call bigEndian on the members of relevance.

Stewart.
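
The struct suggestion could look roughly like the sketch below. The ChunkHeader layout and the fixEndian method name are assumptions made for illustration and are not taken from any library; bigEndian is the helper from this post, condensed.

```d
// The helper from the post: convert a big-endian value to host order.
version (BigEndian)
{
    uint bigEndian(uint value) { return value; }
}
else
{
    uint bigEndian(uint value)
    {
        return value << 24
            | (value & 0x0000FF00) << 8
            | (value & 0x00FF0000) >> 8
            | value >> 24;
    }
}

// Hypothetical layout of the first 8 bytes of a PNG chunk.
struct ChunkHeader
{
    uint length;  // stored big-endian in the file
    char[4] type; // e.g. "IHDR"

    // Convert the members that depend on byte order, once, after reading.
    void fixEndian() { length = bigEndian(length); }
}

void main()
{
    // Length 13 followed by the ASCII codes for "IHDR".
    ubyte[8] raw = [0, 0, 0, 13, 0x49, 0x48, 0x44, 0x52];

    ChunkHeader header;
    (cast(ubyte*) &header)[0 .. header.sizeof] = raw[]; // stand-in for a file read
    header.fixEndian();

    assert(header.length == 13);
    assert(header.type[] == "IHDR");
}
```

Keeping the conversion in one method means the call only has to be remembered once per struct read rather than once per field.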
Mar 11 2011
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 03/11/2011 11:18 AM, Stewart Gordon wrote:
 On 11/03/2011 18:46, Steven Schveighoffer wrote:
 <snip>
 I am not sure what facilities Phobos provides for reading/writing
 integers in network
 order (i.e. Big Endian), but I'm sure there's something.

 http://www.digitalmars.com/d/1.0/phobos/std_stream.html
 EndianStream

 I haven't experimented with it. And I don't expect it to handle structs well.

 Alternatively, you could use some simple code like
 --------
 version (BigEndian) {
      uint bigEndian(uint value) {
          return value;
      }
 }
 version (LittleEndian) {
      uint bigEndian(uint value) {
          return value << 24
            | (value & 0x0000FF00) << 8
            | (value & 0x00FF0000) >> 8
            | value >> 24;
      }
 }

There is also std.intrinsic.bswap

Ali
 --------

 though you would have to remember to call it for each file I/O operation
 that relies on it. If you use a struct, you could put a method in it to
 call bigEndian on the members of relevance.

 Stewart.
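
A short sketch of the bswap route follows. It assumes a current druntime, where the intrinsic is exposed as core.bitop.bswap rather than through the std.intrinsic module named in this thread. Note that bswap always swaps, so on a big-endian host the call would have to be skipped.

```d
import core.bitop : bswap;

void main()
{
    // What a little-endian host gets when it reads the bytes 00 00 00 0D
    // directly into a uint.
    uint asRead = 0x0D00_0000;
    assert(asRead == 218_103_808);

    // Reversing the four bytes recovers the intended chunk length.
    version (LittleEndian)
        assert(bswap(asRead) == 13);
}
```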

Mar 11 2011
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 11/03/2011 19:50, Ali Çehreli wrote:
<snip>
 There is also std.intrinsic.bswap

Well spotted. I don't tend to look at std.intrinsic much. Presumably there's a reason that it's been provided for uint but not ushort or ulong....

Stewart.
Mar 11 2011
parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 11/03/2011 21:51, Steven Schveighoffer wrote:
<snip>
 Presumably there's a reason that it's been provided for uint but not ushort or
ulong....

I think things in std.intrinsic are functions that tie directly to CPU features,

True, but...
 so presumably, the CPU only provides the possibility for 4-byte width.

D is designed to run on a variety of CPUs. Do you really think that they all have a built-in instruction to reverse the order of 4 bytes but no other number?

Stewart.
Mar 11 2011
prev sibling parent nrgyzer <nrgyzer gmail.com> writes:
== Quote from Stewart Gordon (smjg_1998 yahoo.com)'s article
 On 11/03/2011 18:46, Steven Schveighoffer wrote:
 <snip>
 I am not sure what facilities Phobos provides for reading/writing
 integers in network
 order (i.e. Big Endian), but I'm sure there's something.

 http://www.digitalmars.com/d/1.0/phobos/std_stream.html
 EndianStream

 I haven't experimented with it. And I don't expect it to handle
 structs well.

 Alternatively, you could use some simple code like
 --------
 version (BigEndian) {
      uint bigEndian(uint value) {
          return value;
      }
 }
 version (LittleEndian) {
      uint bigEndian(uint value) {
          return value << 24
            | (value & 0x0000FF00) << 8
            | (value & 0x00FF0000) >> 8
            | value >> 24;
      }
 }
 --------
 though you would have to remember to call it for each file I/O
 operation that relies on it.  If you use a struct, you could put a
 method in it to call bigEndian on the members of relevance.

 Stewart.

That's working - thanks for all the replies!
Mar 12 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 11 Mar 2011 16:42:59 -0500, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 On 11/03/2011 19:50, Ali Çehreli wrote:
 <snip>
 There is also std.intrinsic.bswap

Well spotted. I don't tend to look at std.intrinsic much. Presumably there's a reason that it's been provided for uint but not ushort or ulong....

I think things in std.intrinsic are functions that tie directly to CPU features, so presumably, the CPU only provides the possibility for 4-byte width.

-Steve
Mar 11 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, March 11, 2011 14:39:43 Stewart Gordon wrote:
 On 11/03/2011 21:51, Steven Schveighoffer wrote:
 <snip>
 
 Presumably there's a reason that it's been provided for uint but not
 ushort or ulong....

I think things in std.intrinsic are functions that tie directly to CPU features,

True, but...
 so presumably, the CPU only provides the possibility for 4-byte width.

D is designed to run on a variety of CPUs. Do you really think that they all have a built-in instruction to reverse the order of 4 bytes but no other number?

You end up using ntohl and htonl, I believe. They're in core somewhere. I don't think that you necessarily get 64-bit versions, since unfortunately, they're not standard. But perhaps we should add them with implementations (rather than just declarations for C functions) for cases when they don't exist...

IIRC, I had to create 64-bit versions for std.datetime and put them in there directly to do what I was doing, but we really should get the 64-bit versions in druntime at some point.

- Jonathan M Davis
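
As a rough sketch of the ntohl route: on POSIX systems druntime declares it in core.sys.posix.arpa.inet (an assumption about where "in core somewhere" points; other platforms keep it elsewhere). The helper name chunkLength is hypothetical.

```d
version (Posix)
{
    import core.sys.posix.arpa.inet : ntohl;

    // Hypothetical helper: reinterpret four file bytes as a uint exactly as
    // stored, then let ntohl convert from network (big-endian) to host order.
    uint chunkLength(ubyte[4] raw)
    {
        uint asStored;
        (cast(ubyte*) &asStored)[0 .. 4] = raw[];
        return ntohl(asStored);
    }
}

void main()
{
    version (Posix)
    {
        ubyte[4] raw = [0, 0, 0, 13];
        assert(chunkLength(raw) == 13);
    }
}
```

Because ntohl is a no-op on big-endian hosts, the same code gives 13 on either kind of machine.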
Mar 11 2011
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 11 Mar 2011 22:39:43 -0000, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 On 11/03/2011 21:51, Steven Schveighoffer wrote:
 <snip>
 Presumably there's a reason that it's been provided for uint but not  
 ushort or ulong....

I think things in std.intrinsic are functions that tie directly to CPU features,

True, but...
 so presumably, the CPU only provides the possibility for 4-byte width.

D is designed to run on a variety of CPUs. Do you really think that they all have a built-in instruction to reverse the order of 4 bytes but no other number?

I have some in the cryptographic hash modules which I am trying to tidy up for inclusion into phobos. They make use of bswap where possible but otherwise have to do things the long way. It would be nice to have some in std.intrinsic for 16- and 64-bit entities.

R
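
For illustration only, 16- and 64-bit swaps written "the long way" might look like the sketch below. The names swap16 and swap64 are hypothetical and this is not the code from the hash modules mentioned above; the 64-bit version is built from two 32-bit bswap calls, assuming core.bitop.bswap from a current druntime.

```d
import core.bitop : bswap;

// Swap the two bytes of a 16-bit value with plain shifts.
ushort swap16(ushort v)
{
    return cast(ushort)((v << 8) | (v >> 8));
}

// Swap the eight bytes of a 64-bit value: byte-swap each 32-bit half,
// then exchange the halves.
ulong swap64(ulong v)
{
    ulong low  = bswap(cast(uint) v);         // becomes the new high half
    ulong high = bswap(cast(uint)(v >> 32));  // becomes the new low half
    return (low << 32) | high;
}

void main()
{
    assert(swap16(0x1234) == 0x3412);
    assert(swap64(0x0102_0304_0506_0708) == 0x0807_0605_0403_0201);
}
```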
Mar 14 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 11 Mar 2011 17:39:43 -0500, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 On 11/03/2011 21:51, Steven Schveighoffer wrote:
 <snip>
 Presumably there's a reason that it's been provided for uint but not  
 ushort or ulong....

I think things in std.intrinsic are functions that tie directly to CPU features,

True, but...
 so presumably, the CPU only provides the possibility for 4-byte width.

D is designed to run on a variety of CPUs. Do you really think that they all have a built-in instruction to reverse the order of 4 bytes but no other number?

No, but if the CPU does not support it, the compiler must simulate it for that platform. I don't know the reasoning behind only supporting 4 bytes, but I suspect it has something to do with IP addresses being 4 bytes, which probably makes CPU support for that specific length more prevalent.

I suspect the decision to create an intrinsic or just a regular function is highly subjective (i.e. how many CPUs must support an optimized version in order to make it an intrinsic), so you'd have to ask Walter why it's not in there.

-Steve
Mar 14 2011
prev sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
nrgyzer <nrgyzer gmail.com> wrote:

 I'm trying to read a png file and I'm having some trouble with the
 chunk-size. Each chunk of a png file begins with a 4 byte (unsigned)
 integer. When I read this 4 byte integer (uint) I get an absolutely
 incorrect length. My code currently looks like:

 import std.stream; // File and FileMode come from std.stream

 void main(string[] args) {

    File f = new File("test.png", FileMode.In);

    // png signature
    ubyte[8] buffer;
    f.read(buffer);

    // first chunk (IHDR)
    uint size;
    f.read(size);

    f.close();
 }

 When I run my code, I get 218103808 instead of 13 (decimal) or 0x0D
 (hex). When I read the 4-byte integer as a ubyte[4] array instead, I
 get [0, 0, 0, 13], which seems to be correct because my
 hex editor shows [0x00 0x00 0x00 0x0D] for these 4 bytes.

 I hope someone knows where my mistake is. Thanks!

Looks to be an endian issue. The bytes 00 00 00 0D read as 218,103,808 (0x0D00_0000) when interpreted as little-endian (Intel), and as 13 (0x0000_000D) when interpreted as big-endian (Motorola).

-- Simen
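
The arithmetic can be checked by assembling the value from its bytes by hand, which also gives a decoder that does not depend on the host's byte order. A minimal sketch; the function names are made up for illustration.

```d
// First byte is the most significant one (big-endian, as PNG stores it).
uint fromBigEndian(const ubyte[4] b)
{
    return (cast(uint) b[0] << 24) | (cast(uint) b[1] << 16)
         | (cast(uint) b[2] << 8)  |  cast(uint) b[3];
}

// First byte is the least significant one (what an Intel CPU assumes).
uint fromLittleEndian(const ubyte[4] b)
{
    return (cast(uint) b[3] << 24) | (cast(uint) b[2] << 16)
         | (cast(uint) b[1] << 8)  |  cast(uint) b[0];
}

void main()
{
    ubyte[4] raw = [0x00, 0x00, 0x00, 0x0D];

    assert(fromBigEndian(raw) == 13);
    assert(fromLittleEndian(raw) == 218_103_808); // 13 * 2^24
}
```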
Mar 11 2011