digitalmars.D - Suggestions for std.stream (esp. for 64bit)

reply Daniel Gibson <metalcaedes gmail.com> writes:
I'm using some of std.stream's classes (and also SocketStream) in a 
project of mine (it currently uses D1 but std.stream hasn't changed much 
so all this is valid for D2 as well).
std.stream is mostly what I expect from classes for Streams (having 
experience with Java's equivalents). However there are a few things that 
I think could/should be improved.

1. (important for 64bit D): write(char[]) and read(char[]) are too 
platform specific.
These methods (and the wchar counterparts) write (or read) "a string, 
together with its length". This is fine (even though I would rather 
expect writeString/readString to do this magic, but never mind), *but* 
the length is written as a size_t (4 bytes on x86, 8 bytes on amd64), 
making it heavily platform dependent.
This means that you can't use write(char[]) to write into a file on an 
x86 system and later read that file on an amd64 system. Also consider 
SocketStream: you can't use SocketStream.write(char[]) to communicate 
between an x86 and an amd64 box (when a 64bit executable is used on the 
latter).
This could easily be fixed by using uint or ulong instead of size_t on 
all platforms. (uint is probably ok, Java even uses short in a similar 
method (java.io.DataOutput.writeUTF() - never use this, it's no real 
UTF-8)).

Unfortunately the libphobos of GDC (that already supports 64bit targets) 
has been using size_t for ages, so in D1 it should maybe stay like that 
to avoid breaking compatibility (on the other hand probably no GDC user 
who thinks at least a bit cross-platform uses write(char[]) anyway - in 
that case just use uint so it's compatible with existing 32bit binaries 
from DMD).
But at least for D2/phobos2 that should be changed.
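
To illustrate point 1, here is a minimal sketch of a length prefix that has the same size on every platform. It does not use std.stream at all; encodeString and decodeString are hypothetical helper names, and the choice of a little-endian uint is an assumption made for the example, not something std.stream does.

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

// Hypothetical helper, not part of std.stream: the length prefix is always
// a 4-byte little-endian uint, whatever size_t is on the host.
ubyte[] encodeString(const(char)[] s)
{
    assert(s.length <= uint.max, "string too long for a 32-bit prefix");
    ubyte[4] prefix = nativeToLittleEndian(cast(uint) s.length);
    auto result = new ubyte[prefix.length + s.length];
    result[0 .. 4] = prefix[];
    result[4 .. $] = cast(const(ubyte)[]) s;
    return result;
}

// Hypothetical counterpart: reads the 4-byte prefix, then that many bytes.
string decodeString(const(ubyte)[] data)
{
    ubyte[4] prefix = data[0 .. 4];
    immutable len = littleEndianToNative!uint(prefix);
    return (cast(const(char)[]) data[4 .. 4 + len]).idup;
}

void main()
{
    auto bytes = encodeString("abc");
    ubyte[] expectedPrefix = [3, 0, 0, 0];
    assert(bytes.length == 7);               // 4 prefix bytes + 3 characters
    assert(bytes[0 .. 4] == expectedPrefix); // same on x86 and amd64
    assert(decodeString(bytes) == "abc");
}
```

With a prefix like this, data written by a 32bit binary can be read back by a 64bit one, which is the whole point of the suggestion.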

2. The documentation says for write(): "Outside of byte, ubyte, and 
char, the format is implementation-specific and should only be used in 
conjunction with read. Throw WriteException on error."
So how do I write files for other programs to read? How do I communicate 
with servers/clients not written in D with a SocketStream?
Fortunately the documentation exaggerates. Apart from write(char[]) and 
write(wchar[]) the claim is not true.
The simple types are platform specific, not specific to D/std.stream - 
so a program written in C (or any other language) is probably able to 
read that data (if it supports the type - may be tricky with real and 
maybe the imaginary types).
And by using the EndianStream, most of the types probably can be read by 
other platforms (again, real and the imaginary types might cause 
trouble.. and maybe floating point types in general, if the other 
platform doesn't support IEEE 754 floats - but integer types are 
definitely safe).

This is bad, because if someone wants to use SocketStream he's confused 
by that statement until he looks at the source to find out that it's not 
so implementation specific after all.
So please document how exactly write( (w)char[]) encodes the length and 
also make clear that write( <basic type> ) does no strange voodoo, but 
just dumps the bytes of the value (in platform specific big/little 
endian order).
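
As a rough sketch of what "just dumps the bytes" means, the following compares the raw in-memory bytes of an int with explicitly ordered encodings from std.bitmanip. It does not call std.stream; rawBytes is a hypothetical helper that only mimics the behaviour described above.

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;
import std.system : Endian, endian;

// Hypothetical helper: the raw in-memory bytes of an int, in host order.
ubyte[] rawBytes(int value)
{
    return (cast(ubyte*) &value)[0 .. value.sizeof].dup;
}

void main()
{
    int value = 0x01020304;

    // Explicit byte orders are the same on every platform.
    ubyte[4] big = nativeToBigEndian(value);
    ubyte[4] little = nativeToLittleEndian(value);
    ubyte[4] expectedBig = [1, 2, 3, 4];
    ubyte[4] expectedLittle = [4, 3, 2, 1];
    assert(big == expectedBig);
    assert(little == expectedLittle);

    // The raw dump matches whichever order the host happens to use.
    auto hostOrder = (endian == Endian.littleEndian) ? little[] : big[];
    assert(rawBytes(value) == hostOrder);
}
```

A reader in C or any other language only needs to know the integer width and the byte order to decode such data.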

3. InputStream's read( <type> val) often is inconvenient.
If you want to read an int (or any other basic type) from an InputStream 
s you have to do:
   int foo; s.read(foo);
This doesn't look so horrible... but maybe you want to pass that value 
directly to a function?
   int foo; s.read(foo); bar(foo);
This is inconvenient. Java has something like Stream.readInt(), so you 
can write
   bar( s.readInt() );
That's much shorter and you don't need to invent a name for that value 
you only want to use once anyway.
For D I'd suggest a templated method for that:
   T read(T)() { T ret; readExact(&ret, ret.sizeof); return ret; }
So you could write
   int foo = s.read!int; // instead of int foo; s.read(foo);
or even
   auto foo = s.read!int;
and
   bar( s.read!int ); // instead of "int foo; s.read(foo); bar(foo);"
(The implementation should probably make sure T is a basic type or maybe 
a struct, but no array or Object or pointer. Also special cases for 
char[] and wchar[] might be needed for consistency.)
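
A slightly fuller sketch of the templated read, with the type restriction mentioned above expressed as a template constraint. ByteSource is a hypothetical stand-in for an InputStream backed by a byte array, and using isScalarType as the constraint is just one possible choice (it excludes structs, arrays, objects and pointers).

```d
import std.traits : isScalarType;

// Hypothetical stand-in for an InputStream that reads from a byte array.
struct ByteSource
{
    const(ubyte)[] data;

    // Copies exactly `size` bytes into `buffer` or throws.
    void readExact(void* buffer, size_t size)
    {
        if (size > data.length)
            throw new Exception("not enough data");
        (cast(ubyte*) buffer)[0 .. size] = data[0 .. size];
        data = data[size .. $];
    }

    // Returns the value instead of filling an out parameter.
    // Only built-in numeric, character and boolean types are accepted.
    T read(T)() if (isScalarType!T)
    {
        T ret;
        readExact(&ret, T.sizeof);
        return ret;
    }
}

void main()
{
    int[2] values = [42, -7];
    // Reinterpret the ints as raw bytes in host order.
    auto src = ByteSource(cast(const(ubyte)[]) values[]);

    assert(src.read!int == 42);
    auto second = src.read!int;
    assert(second == -7);

    // Rejected at compile time because int[] is not a scalar type.
    static assert(!__traits(compiles, src.read!(int[])));
}
```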

4. Minor inconsistencies:
In InputStream there is a read(ubyte[]) method, but no 
readExact(ubyte[]) method.
I'd suggest adding
   void readExact(ubyte[] buf) { readExact(buf.ptr, buf.length); }
and, for convenience
   ubyte[] readExact(size_t len) {
     ubyte[] ret = new ubyte[len];
     readExact(ret.ptr, len);
     return ret;
   }

Also there is a write(ubyte[]) method in OutputStream, but no 
writeExact(ubyte[]) method, so I'd suggest adding
   void writeExact(ubyte[] buf) { // maybe "const(ubyte[]) buf" for D2
     writeExact(buf.ptr, buf.length);
   }



All code above is untested and thus to be considered pseudo-code ;-)

Cheers,
- Daniel
Oct 05 2010
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
I filed two bug reports on this:
http://d.puremagic.com/issues/show_bug.cgi?id=5001 for the 64bit write(char[]) issue
http://d.puremagic.com/issues/show_bug.cgi?id=5002 for the other suggestions.
Oct 06 2010
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 05 Oct 2010 19:46:37 -0400, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 I'm using some of std.stream's classes (and also SocketStream) in a  
 project of mine (it currently uses D1 but std.Stream hasn't changed much  
 so all this is valid for D2 as well).
 std.stream is mostly what I expect from classes for Streams (having  
 experience with Java's equivalents). However there are a few things that  
 I think could/should be improved.

[snip]

Sorry to have missed this earlier -- std.stream is going to be deprecated 
AFAIK. std.stdio is what is planned to replace it.

-Steve
Oct 06 2010
parent Daniel Gibson <metalcaedes gmail.com> writes:
That's a pity, I kind of like the streams concept. It's handy to be able 
to, e.g., use a file and a network stream alike (of course the latter is 
not seekable, but if you just want to read/append that doesn't matter).

However, IMHO at least that 64bit write(char[]) issue should be fixed now 
that DMD (also for D1) is ported to AMD64.

Cheers,
- Daniel
Oct 06 2010