digitalmars.D.announce - iff module

reply Fredrik Olsson <peylow gmail.com> writes:
Hi.

My first contribution: a module for handling platform-independent data 
using the EA IFF 85 standard 
(http://www.dcs.ed.ac.uk/home/mxr/gfx/2d/IFF.txt).

Left to implement are untyped containers, lists of arbitrary size, and the 
UBYTE, WORD, and UWORD load data types. Then it should follow the spec to 
the letter.

I give links to easily readable HTML, and avoid contaminating the group 
with attachments in the process:
http://peylow.no-ip.org/~peylow/iff.html
http://peylow.no-ip.org/~peylow/ifftest.html

So what is it good for? First of all, reading file formats based on IFF; 
IFF images and AIFF audio come to mind. Then we have the IFF derivatives 
such as WAV (I have to patch for little endian there, as the IFF standard 
clearly says big endian, and yet Microsoft decided little would do).
It is also an easy way to do platform-independent network protocols: 
less data than XML, and far easier to parse.
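
To give a feel for why such data is easy to parse, here is a minimal sketch of decoding one chunk header in D. It assumes only the basic EA IFF 85 chunk layout (a four-character ID, a 32-bit big-endian size, then the data padded to an even length). The names ChunkHeader, parseChunkHeader and paddedSize are made up for this illustration and are not taken from the posted module; the size is read as unsigned here for simplicity.

```d
// Hypothetical helper type; the names are not taken from the posted iff module.
struct ChunkHeader
{
    char[4] id;   // four-character chunk ID, e.g. "FORM"
    uint size;    // payload size in bytes, not counting the header or pad byte
}

// Decode the 8-byte header at the start of `data`.
ChunkHeader parseChunkHeader(const(ubyte)[] data)
{
    assert(data.length >= 8, "an IFF chunk header is 8 bytes");
    ChunkHeader h;
    foreach (i; 0 .. 4)
        h.id[i] = cast(char) data[i];
    // The size is stored big endian, most significant byte first.
    h.size = (cast(uint) data[4] << 24) | (cast(uint) data[5] << 16)
           | (cast(uint) data[6] << 8)  |  cast(uint) data[7];
    return h;
}

// Chunks are padded to an even length; the pad byte is not included in `size`.
uint paddedSize(uint size)
{
    return size + (size & 1);
}

unittest
{
    ubyte[] bytes = [0x46, 0x4F, 0x52, 0x4D, 0x00, 0x00, 0x01, 0x03];
    auto h = parseChunkHeader(bytes);
    assert(h.id[] == "FORM");
    assert(h.size == 259);
    assert(paddedSize(h.size) == 260);
}
```

Because the byte order is fixed by the format, the same code gives the same result on big- and little-endian machines.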

So what will happen? First of all, writing comments on how to use and 
possibly extend the code! If there is a generally accepted format for 
writing comments for doc generators that the D people prefer, please 
tell me before I begin :). I am also thinking of adding the extensions 
introduced by NewTek for LightWave 3D file formats. Or at least the 
short lists, to be able to read object files.


Well, it is my first go at D, so first of all I would like some input on 
my coding style. What is totally unacceptable and what could be better. 
If something is good, please say so as well :).

regards
	Fredrik Olsson
Aug 07 2005
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
Minor bug report:
- Exception in "toFloat()" reads "Could not be cast as int." (note 'int')

Using a struct means you do not get constructors and instead have to go  
for stand alone functions to construct the "Value" type. Or did you prefer  
the stand alone functions? I think this is another example of where struct  
constructors would be very handy, mainly from a design standpoint.

Coding style looks fine. I didn't see anything I would call "bad coding  
style" from my cursory examination. Of course coding style is mostly  
personal preference.

Regan

On Mon, 08 Aug 2005 00:21:15 +0200, Fredrik Olsson <peylow gmail.com>  
wrote:
 <snip>

Aug 07 2005
prev sibling next sibling parent reply Fredrik Olsson <peylow gmail.com> writes:
Fredrik Olsson wrote:
<snip>
 Well, it is my first go at D, so first of all I would like some input on 
 my coding style. What is totally unacceptable and what could be better. 
 If something is good, please say so as well :).
 

One thing I would like to do myself. As it is now, I have many overloaded functions, for example "readFromStream(Stream stream, etc" and "readFromStream(EndianStream stream, etc". I would like to have a function like this:

void ensureEndianStream(inout Stream stream) {
  if (!(stream isa EndianStream)) {
    stream = new EndianStream(stream, Endian.BigEndian);
  }
}

I have not been able to find a way to do what I did with the "isa" operator in my pseudocode above. How do I check the class of an instance? Both the exact class and "descendant of"?

regards
	Fredrik Olsson
Aug 07 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 08 Aug 2005 01:08:23 +0200, Fredrik Olsson <peylow gmail.com>  
wrote:
 <snip>

Use the cast() operator: cast(EndianStream)stream returns null if the stream is neither an EndianStream nor a descendant of it. E.g.

void ensureEndianStream(inout Stream stream) {
  if (cast(EndianStream)stream is null) {
    stream = new EndianStream(stream, Endian.BigEndian);
  }
}

Regan
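
The cast covers the "descendant of" half of the question. For the "exact class" half, a rough sketch in current D syntax is below; it compares the run-time type via typeid. The Stream and EndianStream classes here are empty stand-ins so the example compiles on its own, BufferedEndianStream is a hypothetical descendant, and the helper names are made up for illustration.

```d
// Stand-in classes so the sketch compiles on its own; they are not std.stream.
class Stream {}

class EndianStream : Stream
{
    Stream source;
    this(Stream source) { this.source = source; }
}

// Hypothetical descendant, only here to show the difference between the checks.
class BufferedEndianStream : EndianStream
{
    this(Stream source) { super(source); }
}

// True for EndianStream and for any class derived from it.
bool isEndianStream(Stream s)
{
    return cast(EndianStream) s !is null;
}

// True only when the run-time class is exactly EndianStream.
bool isExactlyEndianStream(Stream s)
{
    return s !is null && typeid(s) == typeid(EndianStream);
}

unittest
{
    Stream plain = new Stream;
    Stream exact = new EndianStream(plain);
    Stream derived = new BufferedEndianStream(plain);

    assert(!isEndianStream(plain));
    assert(isEndianStream(exact));
    assert(isEndianStream(derived));

    assert(isExactlyEndianStream(exact));
    assert(!isExactlyEndianStream(derived));
}
```

In most library code the cast-based check is the one you want, since it keeps working when someone subclasses EndianStream.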
Aug 07 2005
prev sibling parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
 Well, it is my first go at D, so first of all I would like some input on
 my coding style. What is totally unacceptable and what could be better. If 
 something is good, please say so as well :).

You might want to look at the std.boxer module with the Box type. It looks similar to your Value type. Glad to see someone using EndianStream :-).
Aug 07 2005
next sibling parent Fredrik Olsson <peylow gmail.com> writes:
Ben Hinkle wrote:
Well, it is my first go at D, so first of all I would like some input on
my coding style. What is totally unacceptable and what could be better. If 
something is good, please say so as well :).

You might want to look at the std.boxer module with the Box type. It looks similar to your Value type.

I would like to, but unfortunately it fails miserably at link time when unboxing char[], with a missing __init_11TypeInfo_Pv symbol. I have asked about it in a previous thread and it is a known error (for gdc at least?), reported to DStress and all. I will probably change to boxer once the bug is resolved; less code is better code :).
 Glad to see someone using EndianStream :-). 
 

I loved it; it is the most natural place to put endian code I can think of. The C version is littered with macros to do the swapping, and it took days just to find all the places where it was appropriate/not appropriate. So I am cleaning up this piece of code and then off to write an SslSocket class :).

regards
	Fredrik Olsson
Aug 07 2005
prev sibling parent reply Niko Korhonen <niktheblak hotmail.com> writes:
Ben Hinkle wrote:
 Glad to see someone using EndianStream :-). 

I use it in my tag library (ID3/APE/Lyrics/Vorbis/etc) to read/write binary data in the correct endian format. It will be published Real Soon Now(tm) :)

Actually there's a bit of a problem because my functions are in the format write(Stream s) and I can't be sure whether the provided stream is already an EndianStream or not. I've considered something like:

<code>
void write(Stream s)
{
   EndianStream es = new EndianStream(s, Endian.LittleEndian);
   // ...
}
</code>

But this is wasteful if s is already an EndianStream. The problem with casting the argument Stream to EndianStream, or taking an EndianStream as the argument, is that I can't change the endianness of the stream. Instead I have to do something awfully verbose like this:

<code>
void write(Stream s)
{
   EndianStream es = cast(EndianStream)s;
   if (es is null)
      es = new EndianStream(s, Endian.LittleEndian);
   Endian original = es.endian;
   es.endian = Endian.LittleEndian;
   // Do some I/O
   es.endian = original;
}
</code>

Somehow I just don't like the look of that code, would anyone have some improvement suggestions?

-- 
Niko Korhonen
SW Developer
Aug 08 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Niko Korhonen" <niktheblak hotmail.com> wrote in message 
news:dd7c6e$25dr$1 digitaldaemon.com...
 <snip>

I hadn't thought about people switching endianness on an existing stream in use. Currently that will be a problem with the ungetcw function since the unget buffer stores the data with the endianness already applied. I'll probably add an endianness setter that runs through the ungetcw buffer and swaps any content or maybe I'll have to store the ungetcw buffer in the same order as the source.
 Somehow I just don't like the look of that code, would anyone have some 
 improvement suggestions?

A try-finally is probably needed around the Do some I/O so that the original endianness is restored. It is pretty ugly, I agree. Then again if someone hands you a big-endian stream and you change it to little endian something fishy is going on between the user and the library. An assert might be better or a run-time check and throw if they hand you the wrong endianness.
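
As a rough sketch of both suggestions (restore in a finally block, or refuse a stream of the wrong endianness), something like the following could work. The Endian enum and EndianStream class are minimal stand-ins so the example compiles on its own, and withLittleEndian and requireLittleEndian are made-up helper names, not part of any library.

```d
// Stand-ins so the sketch compiles on its own; the real types lived in std.stream.
enum Endian { BigEndian, LittleEndian }

class EndianStream
{
    Endian endian;
    this(Endian endian) { this.endian = endian; }
}

// Run doIO with the stream temporarily switched to little endian.
void withLittleEndian(EndianStream es, void delegate() doIO)
{
    Endian original = es.endian;
    es.endian = Endian.LittleEndian;
    try
    {
        doIO();
    }
    finally
    {
        es.endian = original;   // runs even if doIO throws
    }
}

// The alternative: a run-time check that rejects the wrong endianness.
void requireLittleEndian(EndianStream es)
{
    if (es.endian != Endian.LittleEndian)
        throw new Exception("expected a little-endian stream");
}

unittest
{
    auto es = new EndianStream(Endian.BigEndian);
    Endian seen;
    try
    {
        withLittleEndian(es, delegate void() {
            seen = es.endian;
            throw new Exception("I/O failed");
        });
    }
    catch (Exception e) {}
    assert(seen == Endian.LittleEndian);   // the I/O ran as little endian
    assert(es.endian == Endian.BigEndian); // and the original was restored
}
```

The check-and-throw variant is the simpler of the two, since it never touches state that belongs to the caller.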

Aug 08 2005
parent reply Niko Korhonen <niktheblak hotmail.com> writes:
Ben Hinkle wrote:
 I hadn't thought about people switching endianness on an existing stream in 
 use.

Ok, somehow I anticipated that. But it's not really a problem for me anymore, since I decided just to create a new EndianStream instance at top-level containers and pass that around when doing the serialization routine. It's not /that/ wasteful to create just one little EndianStream instance per written/read tag, methinks :)

However, there should be something in the docs about switching endianness if it causes problems. A read-only property, perhaps?
 Then again if someone 
 hands you a big-endian stream and you change it to little endian something 
 fishy is going on between the user and the library.

Yep, I can't change the endianness of an argument stream because the caller will not expect that. Just creating a new EndianStream instance and accepting the microscopic cost involved is about an order-of-magnitude safer anyway.

-- 
Niko Korhonen
SW Developer
Aug 08 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Niko Korhonen" <niktheblak hotmail.com> wrote in message 
news:dd7nvl$2g91$1 digitaldaemon.com...
 <snip>

OK. Now that I think about it, repeatedly wrapping a stream with another stream and going back and forth might wind up with some odd state in the original stream wrt unget buffers and end-of-line flags. For example, if a stream has a non-empty unget buffer and it is wrapped with an EndianStream, the EndianStream keeps its own unget buffer and so it isn't in sync with the source unget buffer. When the user goes back to using the source stream, the unget buffer is the same as when it started, while the EndianStream might still have content in it.

I should add some documentation that if you wrap and unwrap a stream that is already in use, the binary operations are the only truly safe operations. The unget and end-of-line state between the source and wrapper stream are not kept in sync.

Also it would be nice to say in the EndianStream doc that you should always have the EndianStream at the head of the wrappers, otherwise no endian swapping gets applied, since other wrappers talk to the source stream via bytes only.
Aug 08 2005
parent reply Niko Korhonen <niktheblak hotmail.com> writes:
Ben Hinkle wrote:
 OK. Now that I think about it repeatedly wrapping a stream with another 
 stream and going back and forth might wind up with some odd state in the 
 original stream wrt unget buffers and end-of-line flags.

What would you suggest then? Especially, what would be the nicest way of using EndianStream in a library function that takes a Stream argument and where no assumptions can be made about the passed Stream object:

1) Always create an EndianStream wrapper.
2) Check whether the given stream is EndianStream. If yes, use that stream. If not, create an EndianStream wrapper.
3) Take an EndianStream argument instead of a Stream, thus forcing the client to create an EndianStream wrapper.
4) Ignore endianness completely. The client will create an EndianStream wrapper if needed.
 I should add some documentation that if you wrap and unwrap a stream that is 
 already in use the binary operations are the only truly safe operations.

Agreed.

-- 
Niko Korhonen
SW Developer
Aug 08 2005
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Niko Korhonen" <niktheblak hotmail.com> wrote in message 
news:dd86q0$30sl$1 digitaldaemon.com...
 <snip>

Since in your initial post you mentioned your code only does binary I/O, you can safely wrap. I'd do 2a), which is 2) but throwing if the EndianStream is the wrong endianness. That way the user can make their own EndianStream if they don't want the performance hit of making lots of little wrappers, and yet if they make the wrong EndianStream they'll find out about it quickly. If you want you can write a little helper method "getCompatibleStream" or something:

class Foo {
  // returns a stream possibly wrapping the source stream s that
  // is compatible with class Foo. Returns null if s can't be made compatible.
  // Using the same compatible stream instance to write multiple Foo
  // instances is more efficient than getting a compatible stream for
  // each Foo instance.
  Stream getCompatibleStream(Stream s) {
    assert( s );
    EndianStream es = cast(EndianStream)s;
    if (!es)
      es = new EndianStream(s,Endian.LittleEndian);
    if (es.endian != Endian.LittleEndian)
      es = null;
    return es;
  }
  void write(Stream s) {
    s = getCompatibleStream(s);
    assert( s );
    ... carry on, nothing to see here ...
  }
}

Aug 08 2005
parent reply Niko Korhonen <niktheblak hotmail.com> writes:
How about using a stream and a wrapper simultaneously, something like this:

void write(Stream s)
{
   EndianStream es = new EndianStream(s, Endian.LittleEndian);
   es.write(/* uint argument */);
   s.writeString(/* string argument*/ );
   es.write(/* long argument */);
   s.write(/* ubyte argument */);
}

Do you see any problems with that?

-- 
Niko Korhonen
SW Developer
Aug 09 2005
parent "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Niko Korhonen" <niktheblak hotmail.com> wrote in message 
news:dd9ldm$1hr8$1 digitaldaemon.com...
 <snip>

Stream.writeString is fine since it writes the length as binary followed by the string contents. The problem methods are the ones that call getc/getcw, like readLine and readf. Now that I think about it, though, mixing get/unget with binary reading isn't a good idea even with one stream, let alone a wrapper (I should have thought of that earlier - sorry!). The unget buffer is not checked by the binary read methods, so anything you unget sits unnoticed until you getc again and it magically pops back up. I believe this matches how C streams work, but I haven't ever actually tried mixing getc/ungetc with fread.
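
To make the effect concrete, here is a toy model of the behaviour described above. ToyStream is a made-up class written only for this illustration; it is not std.stream and only mimics the one property under discussion, namely that the binary read skips the unget buffer while getc drains it first.

```d
// Toy model only: a made-up class, not std.stream.
class ToyStream
{
    private const(ubyte)[] data;
    private size_t pos;
    private char[] unget;

    this(const(ubyte)[] data) { this.data = data; }

    // Binary read: goes straight to the data and never looks at the unget buffer.
    ubyte readByte() { return data[pos++]; }

    // Character read: drains the unget buffer before touching the data.
    char getc()
    {
        if (unget.length)
        {
            char c = unget[$ - 1];
            unget = unget[0 .. $ - 1];
            return c;
        }
        return cast(char) data[pos++];
    }

    void ungetc(char c) { unget ~= c; }
}

unittest
{
    auto s = new ToyStream(cast(const(ubyte)[]) "abc");
    char first = s.getc();          // consumes 'a'
    s.ungetc(first);                // 'a' now sits in the unget buffer
    assert(s.readByte() == 'b');    // the binary read does not see it
    assert(s.getc() == 'a');        // it pops back up on the next getc
    assert(s.getc() == 'c');
}
```

The bytes come out as b, a, c instead of a, b, c, which is the kind of surprise you get when the two styles of reading are mixed on one stream.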
Aug 09 2005