digitalmars.D - non-seekable streams and size()

reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
What should size() of a non-seekable stream return or do? Currently it 
depends on the stream type: for a general stream it throws a SeekException 
and for a File on Windows it returns 0 (which is just what GetFileSize 
returns for non-seekable streams like pipes). I'm tempted to have it return 
ulong.max. Any objections?

While I'm at it I'm making eof testing more efficient for both seekable and 
non-seekable streams by using the convention that if readBlock returns 0 
then the stream is at eof (and I'd like to document that). Technically that 
wasn't part of the existing readBlock's documentation but it's what happens 
in practice and it comes in handy with non-seekable streams. 
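
The eof convention described above can be sketched as follows. This is only a minimal illustration in current D: `ByteSource`, `ArraySource` and `readAll` are hypothetical names made up for the example, not the std.stream API. The only thing taken from the post is the rule that a `readBlock` returning 0 means end of stream.

```d
import std.algorithm.comparison : min;

// Hypothetical minimal input interface; NOT the actual std.stream API.
interface ByteSource
{
    // Reads up to buffer.length bytes and returns the number read.
    // By the convention discussed above, 0 means end of stream.
    size_t readBlock(ubyte[] buffer);
}

// In-memory source used only to exercise the convention.
class ArraySource : ByteSource
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    size_t readBlock(ubyte[] buffer)
    {
        immutable n = min(buffer.length, data.length);
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Drains a source without ever asking for its size or position.
ubyte[] readAll(ByteSource src)
{
    ubyte[] result;
    ubyte[4] chunk;
    for (;;)
    {
        immutable n = src.readBlock(chunk[]);
        if (n == 0)
            break; // eof by convention
        result ~= chunk[0 .. n];
    }
    return result;
}
```

The loop works the same way for seekable and non-seekable sources, which is the appeal of the convention. One caveat of this sketch: a caller that passes an empty buffer also gets 0 back, so the convention only holds for non-empty reads.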
Apr 17 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
Out of scope probably....

Imho, "seekable" stream is a nonsense.

If stream is seekable then it is a vector.
Almost in all cases such stream could be represented
as char[] or wchar[], etc. MM files allows to expand
this not only on heap memory but to the file access.

For text IO it makes sense to support simple idiom of
formatting Writer and Reader's.

class Writer { this(IPutChar inp){} uint writef(...) {}  }
class Reader { this(IGetChar outp){} uint readf(...) {}  }

I guess this is just enough for implementation of
stdio/stdout style of applications.

C++ <stream> and co. are so universal, theoretical and generic
that it is almost not used in real life in pure form.
These << and >> are sounds good for first semester student
but is a nightmare when you will try to output/input something
formatted for real life. And yet << and >> are "poor C++ man"
approach to handle types of unisex arguments.

Our old friends printf/writef and scanf/readf
are time proven and do really work. In D
when you have (seems like :-) access to TypeInfo of arguments
writef/readf are just perfect - compact and powerful.

a?

IMHO, IMHO and again IMHO.

Andrew.

"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:d3u2i7$1vbi$1 digitaldaemon.com...
 What should size() of a non-seekable stream return or do? Currently it 
 depends on the stream type: for a general stream it throws a SeekException 
 and for a File on Windows it returns 0 (which is just what GetFileSize 
 returns for non-seekable streams like pipes). I'm tempted to have it 
 return ulong.max. Any objections?

 While I'm at it I'm making eof testing more efficient for both seekable 
 and non-seekable streams by using the convention that if readBlock returns 
 0 then the stream is at eof (and I'd like to document that). Technically 
 that wasn't part of the existing readBlock's documentation but it's what 
 happens in practice and it comes in handy with non-seekable streams.
 
Apr 17 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:d3u9cf$25du$1 digitaldaemon.com...
 Out of scope probably....

 Imho, "seekable" stream is a nonsense.

 If stream is seekable then it is a vector.
Files on disk are seekable and they can be too large or too cumbersome to fit into memory.
 Almost in all cases such stream could be represented
 as char[] or wchar[], etc. MM files allows to expand
 this not only on heap memory but to the file access.
The classic example is a large file of binary data organized into many chunks of the same size (ie a huge array of structs on disk). Random access to such data requires seeking. Is such a situation infrequent enough to be ignored? It's a reasonable question. Some APIs don't allow random access and instead have some streams support a mark/reset API.
 For text IO it makes sense to support simple idiom of
 formatting Writer and Reader's.

 class Writer { this(IPutChar inp){} uint writef(...) {}  }
 class Reader { this(IGetChar outp){} uint readf(...) {}  }

 I guess this is just enough for implementation of
 stdio/stdout style of applications.
Std.stream has writef and scanf in OutputStream and InputStream interfaces and implemented in Stream. Suggestions for improving InputStream and OutputStream are always welcome.
 C++ <stream> and co. are so universal, theoretical and generic
 that it is almost not used in real life in pure form.
 These << and >> are sounds good for first semester student
 but is a nightmare when you will try to output/input something
 formatted for real life. And yet << and >> are "poor C++ man"
 approach to handle types of unisex arguments.
It will probably be a while (if ever) before << and >> become part of std.stream.
 Our old friends printf/writef and scanf/readf
 are time proven and do really work. In D
 when you have (seems like :-) access to TypeInfo of arguments
 writef/readf are just perfect - compact and powerful.
agreed.
 a?
?
 IMHO, IMHO and again IMHO.
no problem.
 Andrew.

 "Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
 news:d3u2i7$1vbi$1 digitaldaemon.com...
 What should size() of a non-seekable stream return or do? Currently it 
 depends on the stream type: for a general stream it throws a 
 SeekException and for a File on Windows it returns 0 (which is just what 
 GetFileSize returns for non-seekable streams like pipes). I'm tempted to 
 have it return ulong.max. Any objections?

 While I'm at it I'm making eof testing more efficient for both seekable 
 and non-seekable streams by using the convention that if readBlock 
 returns 0 then the stream is at eof (and I'd like to document that). 
 Technically that wasn't part of the existing readBlock's documentation 
 but it's what happens in practice and it comes in handy with non-seekable 
 streams.
Apr 17 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
Hi, Ben, see below:

"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:d3udbh$2945$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
 news:d3u9cf$25du$1 digitaldaemon.com...
 Out of scope probably....

 Imho, "seekable" stream is a nonsense.

 If stream is seekable then it is a vector.
Files on disk are seekable and they can be too large or too cumbersome to fit into memory.
Ummm.... memory mapped files (at least in Win32) are not mapped as a whole. You only get access to 4k pages at a time.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngenlib/html/msdn_manamemo.asp
So it is not an issue.
 Almost in all cases such stream could be represented
 as char[] or wchar[], etc. MM files allows to expand
 this not only on heap memory but to the file access.
The classic example is a large file of binary data organized into many chunks of the same size (ie a huge array of structs on disk). Random access to such data requires seeking. Is such a situation infrequent enough to be ignored? It's a reasonable question. Some APIs don't allow random access and instead have some streams support a mark/reset API.
What is wrong with classic fread/fwrite in "rb"/"wb" modes ? They just work.
 For text IO it makes sense to support simple idiom of
 formatting Writer and Reader's.

 class Writer { this(IPutChar inp){} uint writef(...) {}  }
 class Reader { this(IGetChar outp){} uint readf(...) {}  }

 I guess this is just enough for implementation of
 stdio/stdout style of applications.
Std.stream has writef and scanf in OutputStream and InputStream interfaces and implemented in Stream. Suggestions for improving InputStream and OutputStream are always welcome.
Text IO and binary IO are, IMHO, too different entities and it is better to do not mix them and to use something like this:

class writer { this(IPutChar inp){} uint writef(...) {} }
class reader { this(IGetChar outp){} uint readf(...) {} }
class bin_writer { this(IPutByte inp){} uint write(...) {} }
class bin_reader { this(IGetByte outp){} uint read(...) {} }

The main difference of bin_writer/reader from fread/fwrite is that they use some uniform format for binary data common for little/big endians.
Text reader/writer should take care about encodings.

Various implementations of IPutChar and IPutByte - this all we need.
Like:

      IGetByte File.byteSrc():
      IGetChar File.charSrc():
      IGetByte Socket.byteSrc():
      IGetChar Socket.charSrc():

      IGetByte byteSrc(ubyte[]):
      IGetChar charSrc(ubyte[]):

interface IGetChar
{
   bool fetch(out dchar c);
}
interface IGetByte
{
   bool fetch(out ubyte b);
}

interface IPutChar
{
   bool store(dchar c);
}
interface IPutByte
{
   bool store(ubyte b);
}
 C++ <stream> and co. are so universal, theoretical and generic
 that it is almost not used in real life in pure form.
 These << and >> are sounds good for first semester student
 but is a nightmare when you will try to output/input something
 formatted for real life. And yet << and >> are "poor C++ man"
 approach to handle types of unisex arguments.
It will probably be a while (if ever) before << and >> become part of std.stream.
Please don't do that. If anyone needs this idiom (e.g. Mango) then an opShl and opShr implementation is just a matter of minutes, in the particular place that knows what format to use and how exactly to emit/inject stuff.
 Our old friends printf/writef and scanf/readf
 are time proven and do really work. In D
 when you have (seems like :-) access to TypeInfo of arguments
 writef/readf are just perfect - compact and powerful.
agreed.
 a?
?
:) Nothing, eh?
Apr 17 2005
parent "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:d3ugvl$2cd2$1 digitaldaemon.com...
 Hi, Ben, see below:

 "Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
 news:d3udbh$2945$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
 news:d3u9cf$25du$1 digitaldaemon.com...
 Out of scope probably....

 Imho, "seekable" stream is a nonsense.

 If stream is seekable then it is a vector.
Files on disk are seekable and they can be too large or too cumbersome to fit into memory.
Ummm.... memory mapped files ( at least in Win32 ) are not mapped in the whole. Only 4k pages you are getting access to. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngenlib/html/msdn_manamemo.asp So it is not an issue.
well, true that you can map and unmap different parts of the file. I was lumping that into "cumbersome" but I suppose it isn't that bad.
 Almost in all cases such stream could be represented
 as char[] or wchar[], etc. MM files allows to expand
 this not only on heap memory but to the file access.
The classic example is a large file of binary data organized into many chunks of the same size (ie a huge array of structs on disk). Random access to such data requires seeking. Is such a situation infrequent enough to be ignored? It's a reasonable question. Some APIs don't allow random access and instead have some streams support a mark/reset API.
What is wrong with classic fread/fwrite in "rb"/"wb" modes ? They just work.
seekable streams just work, too :-)
 For text IO it makes sense to support simple idiom of
 formatting Writer and Reader's.

 class Writer { this(IPutChar inp){} uint writef(...) {}  }
 class Reader { this(IGetChar outp){} uint readf(...) {}  }

 I guess this is just enough for implementation of
 stdio/stdout style of applications.
Std.stream has writef and scanf in OutputStream and InputStream interfaces and implemented in Stream. Suggestions for improving InputStream and OutputStream are always welcome.
 Text IO and binary IO are, IMHO, too different entities and it is better to do not mix them and to use something like this:

 class writer { this(IPutChar inp){} uint writef(...) {} }
 class reader { this(IGetChar outp){} uint readf(...) {} }
 class bin_writer { this(IPutByte inp){} uint write(...) {} }
 class bin_reader { this(IGetByte outp){} uint read(...) {} }

 The main difference of bin_writer/reader from fread/fwrite is that they use some uniform format for binary data common for little/big endians.
EndianStream allows custom control of the binary data endianness - and it covers the endianness of wchar strings, too.
 Text reader/writer should take care about encodings.
Since D is UTF-centric so too is std.stream - although it is missing the dchar functions. It would be nice if phobos had some helpers for managing encodings, but that's a slightly messy area to get into.
 Various implementations of IPutChar and  IPutByte - this all we
 need.
 Like:

      IGetByte File.byteSrc():
      IGetChar File.charSrc():
      IGetByte Socket.byteSrc():
      IGetChar Socket.charSrc():

      IGetByte byteSrc(ubyte[]):
      IGetChar charSrc(ubyte[]):

 interface IGetChar
 {
    bool fetch(out dchar c);
 }
 interface IGetByte
 {
    bool fetch(out ubyte b);
 }

 interface IPutChar
 {
    bool store(dchar c);
 }
 interface IPutByte
 {
    bool store(ubyte b);
 }
That's a reasonable approach (assuming the rest of the API would be rich enough to do all the things std.stream does). I think Mango does something similar though I can't remember. I tend to like the simplicity of std.stream. You just get a File (or whatever) and use it. Plus there is enough overlap between all the text_read/bin_read/text_write/bin_write that personally I think it makes sense to lump everything together. If anything the file std.stream is getting a tad large so maybe some of the less common streams can go into a different module.
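
To make the proposal quoted above a little more concrete, here is a minimal sketch of one byte source in current D. Only the `IGetByte` interface and the `byteSrc(ubyte[])` name come from the post; `ArrayByteSource` and `sumBytes` are hypothetical helpers invented for the illustration, and none of this is existing Phobos or Mango code.

```d
// Interface as proposed in the post quoted above.
interface IGetByte
{
    bool fetch(out ubyte b);
}

// Hypothetical adapter: a byte source over an in-memory array.
class ArrayByteSource : IGetByte
{
    private const(ubyte)[] rest;

    this(const(ubyte)[] data) { rest = data; }

    bool fetch(out ubyte b)
    {
        if (rest.length == 0)
            return false;    // source exhausted
        b = rest[0];
        rest = rest[1 .. $];
        return true;
    }
}

// Free function matching the suggested byteSrc(ubyte[]) factory.
IGetByte byteSrc(const(ubyte)[] data)
{
    return new ArrayByteSource(data);
}

// Hypothetical consumer: it only needs fetch(), never a size or a position.
uint sumBytes(IGetByte src)
{
    uint total = 0;
    ubyte b;
    while (src.fetch(b))
        total += b;
    return total;
}
```

A file or socket source would implement the same one-method interface, which is the simplicity being argued for. The cost Ben points at is that everything std.stream bundles together (formatting, endianness, seeking) has to be layered on top separately.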
Apr 17 2005
prev sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Ben Hinkle wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote:
 Almost in all cases such stream could be represented as char[] or
 wchar[], etc. MM files allows to expand this not only on heap
 memory but to the file access.
The classic example is a large file of binary data organized into many chunks of the same size (ie a huge array of structs on disk). Random access to such data requires seeking. Is such a situation infrequent enough to be ignored? It's a reasonable question. Some APIs don't allow random access and instead have some streams support a mark/reset API.
IMHO, the more things grow, the more things grow.

Hard disks will stay larger than memory, and therefore we cannot start relying on MM files only.

Seekability has "always" been one of the cornerstones in file handling. I'd (almost) go as far as saying, that no serious RDBMS can be built without seekability. Since D is a "systems language", there's no way we can skip seekability. (We all do want Oracle to be ported to D, don't we? :-) )

However, any input where you don't know the size of the entire input, seeking is something you don't do. (And don't let the VB-guy try to do.)
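
The "huge array of structs on disk" case quoted above can be sketched like this in current D, using `std.stdio.File` rather than the std.stream classes under discussion. `Record` and `readRecord` are hypothetical names, and the on-disk layout is simply the in-memory layout of the struct on the machine that wrote it, which is an assumption a real file format would not make.

```d
import std.stdio : File;

// Hypothetical fixed-size record; the layout is whatever the compiler uses.
struct Record
{
    int id;
    double value;
}

// Reads record number `index` from a file of tightly packed Records
// by seeking straight to its byte offset.
Record readRecord(File f, ulong index)
{
    f.seek(cast(long)(index * Record.sizeof));
    Record[1] slot;
    auto got = f.rawRead(slot[]);
    if (got.length != 1)
        throw new Exception("record index past end of file");
    return slot[0];
}
```

Without seeking, reaching record N means reading and discarding the N records before it, which is the cost the seekability argument is about.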
Apr 17 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 IMHO, the more things grow, the more things grow.

 Hard disks will stay larger than memory, and therefore we cannot start 
 relying on MM files only.
Yes. Not only. But...
Please read rationale in Konstantin Knizhnik FastDB
http://www.garret.ru/~knizhnik/fastdb/FastDB.htm

Andrew.
Apr 17 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
Sorry this is the URL
http://www.garret.ru/~knizhnik/fastdb.html 
Apr 17 2005
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Thanks for the link. I'll read that as soon as I have time. Looks 
promising for quite a few projects of mine!


Andrew Fedoniouk wrote:
 Sorry this is the URL
 http://www.garret.ru/~knizhnik/fastdb.html 
 
 
Apr 17 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
BTW: Another extremely simple DB (well, sort of) which could be used
in D "as is" is described in my article on CodeProject
http://www.codeproject.com/cpp/flattables.asp

Andrew.


"Georg Wrede" <georg.wrede nospam.org> wrote in message 
news:4262F2C3.4050507 nospam.org...
 Thanks for the link. I'll read that as soon as I have time. Looks 
 promising for quite a few projects of mine!


 Andrew Fedoniouk wrote:
 Sorry this is the URL
 http://www.garret.ru/~knizhnik/fastdb.html 
Apr 17 2005
prev sibling next sibling parent reply Georg Wrede <georg.wrede nospam.org> writes:
Size() implies seekability.

Someone using size() on non-seekable streams is making a programmer 
error, IMHO. My suggestion is a non-quenchable error.


Ben Hinkle wrote:
 What should size() of a non-seekable stream return or do? Currently it 
 depends on the stream type: for a general stream it throws a SeekException 
 and for a File on Windows it returns 0 (which is just what GetFileSize 
 returns for non-seekable streams like pipes). I'm tempted to have it return 
 ulong.max. Any objections?
 
 While I'm at it I'm making eof testing more efficient for both seekable and 
 non-seekable streams by using the convention that if readBlock returns 0 
 then the stream is at eof (and I'd like to document that). Technically that 
 wasn't part of the existing readBlock's documentation but it's what happens 
 in practice and it comes in handy with non-seekable streams. 
 
 
Apr 17 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
That was my first thought also. However...

Technically it's possible to have a stream which knows how long it is, but  
is not seekable.
Practically I'm struggling to think of an example.

On Sun, 17 Apr 2005 21:22:25 +0300, Georg Wrede <georg.wrede nospam.org>  
wrote:
 Size() implies seekability.

 Someone using size() on non-seekable streams is making a programmer  
 error, IMHO. My suggestion is a non-quenchable error.


 Ben Hinkle wrote:
 What should size() of a non-seekable stream return or do? Currently it  
 depends on the stream type: for a general stream it throws a  
 SeekException and for a File on Windows it returns 0 (which is just  
 what GetFileSize returns for non-seekable streams like pipes). I'm  
 tempted to have it return ulong.max. Any objections?
  While I'm at it I'm making eof testing more efficient for both  
 seekable and non-seekable streams by using the convention that if  
 readBlock returns 0 then the stream is at eof (and I'd like to document  
 that). Technically that wasn't part of the existing readBlock's  
 documentation but it's what happens in practice and it comes in handy  
 with non-seekable streams.
Apr 17 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 Practically I'm struggling to think of an example.
HTTP GET stream. Client may know resource length upfront (HEAD request) but such stream is not seekable. It is just socket stream. But in any case that resource length is not the knowledge HTTP GET shall rely on.
Apr 17 2005
next sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 17 Apr 2005 18:59:08 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 Practically I'm struggling to think of an example.
HTTP GET stream. Client may know resource length upfront (HEAD request) but such stream is not seekable.
You mean defined in the Content-Length header? Or are we talking HTTP 1.1 which sends lengths then data? (IIRC) Either way, if you instantiate the GET stream _after_ reading the length then I guess it could know its length... if you instantiate before reading the length then there is a period where it has an indeterminate length. Right?
 It is just socket stream.
So it doesn't know the length itself.
 But in any case that resource length is not the knowledge HTTP GET
 shall rely on.
Because the socket could close prematurely. You can only really know the length of a socket once it closes (once you know where it ends) Regan
Apr 17 2005
prev sibling parent reply Ben Hinkle <Ben_member pathlink.com> writes:
In article <d3v49f$2s9p$1 digitaldaemon.com>, Andrew Fedoniouk says...
 Practically I'm struggling to think of an example.
HTTP GET stream. Client may know resource length upfront (HEAD request) but such stream is not seekable. It is just socket stream. But in any case that resource length is not the knowledge HTTP GET shall rely on.
I haven't thought about the details but I could imagine a compressed stream like ZipStream that knows its length but isn't seekable.
Apr 17 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 I haven't thought about the details but I could imagine a compressed 
 stream like
 ZipStream that knows its length but isn't seekable.
quod erat demonstrandum. :)

I told you! seekable streams are nonsense.
Even for "simple" cases like text files....
Text streams in different encodings by definition has no "position"
As stream output may depend on physical byte number ***and***
previous state of the stream.

Just think about it. Have you ever use any stream with positioning
in your practice? It is either pure stream (getNextChar) or sort of
block-oriented access like fread/fwrite. But not their mix.

Andrew.
Apr 17 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:d3v9ps$31hn$1 digitaldaemon.com...
 I haven't thought about the details but I could imagine a compressed 
 stream like ZipStream that knows its length but isn't seekable.
quod erat demonstrandum. :)
au contraire, existence of one doesn't imply non-existence of the other. :-) Actually now that I think about it some more a program that extracts records from a Zip file most likely skips immediately to the file position of the file to extract and reads from there. I looked at std.zip and it only works on data in memory but it essentially does that.
 I told you! seekable streams are nonsense.
 Even for "simple" cases like text files....
 Text streams in different encodings  by definition has no "position"
 As stream output may depend on physical byte number ***and***
 previous state of the stream.
position is defined by byte position, not character position (unless the encoding is dchars).
 Just think about it. Have you ever use any stream with positioning
 in your practice? It is either pure stream (getNextChar) or sort of
 block-oriented access like fread/fwrite. But not their mix.
yes I've used positioning with MATLAB many times. Finding and reading chunks of a file without having to read from the start is handy.
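
The distinction between byte position and character position made above can be illustrated with a short sketch in current D. `codePointOffsets` is a hypothetical helper written for this example; `std.utf.decode` is the Phobos function that decodes one code point and advances the index.

```d
import std.utf : decode;

// Returns the byte offset at which each code point of a UTF-8 string starts.
size_t[] codePointOffsets(string s)
{
    size_t[] offsets;
    size_t i = 0;
    while (i < s.length)
    {
        offsets ~= i;
        decode(s, i);    // advances i past one encoded code point
    }
    return offsets;
}
```

For the string "a\u00E9b" the third character starts at byte offset 3, not 2, because U+00E9 takes two bytes in UTF-8. A seek to a byte position is therefore always well defined, while a seek to "character N" requires decoding everything before it unless the encoding is fixed-width.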
Apr 18 2005
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Ben Hinkle wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
 news:d3v9ps$31hn$1 digitaldaemon.com...
 
 I haven't thought about the details but I could imagine a
 compressed stream like ZipStream that knows its length but isn't
 seekable.
quod erat demonstrandum. :)
au contraire, existence of one doesn't imply non-existence of the other. :-) Actually now that I think about it some more a program that extracts records from a Zip file most likely skips immediately to the file position of the file to extract and reads from there. I looked at std.zip and it only works on data in memory but it essentially does that.
 I told you! seekable streams are nonsense. Even for "simple" cases
 like text files.... Text streams in different encodings  by
 definition has no "position" As stream output may depend on physical
 byte number ***and*** previous state of the stream.
position is defined by byte position, not character position (unless the encoding is dchars).
 Just think about it. Have you ever use any stream with positioning 
 in your practice? It is either pure stream (getNextChar) or sort of
  block-oriented access like fread/fwrite. But not their mix.
yes I've used positioning with MATLAB many times. Finding and reading chunks of a file without having to read from the start is handy.
Files (ex. on disk) can be opened for serial access, or random access.

You can also read some, and then skip a number of bytes -- this can be done both with files opened for serial or random access, and also with streams.

But positioning with a stream, that's not what a stream implementation should do. Neither trying to get the size.

The application _using_ your stream is of course free to pretend it can position. But that has to be done with opening your stream and just skipping (i.e. reading and discarding values) till the app is happy with the "position".

In general, streams should only do stuff that's "within the stream concept". For all we know, the stream could become connected to the keyboard, and then there is no way of knowing in advance how much or when Georg is going to type before he gets fed-up. Right?

----

It's like a programmer creates an array implementation. And while he's at it he writes the methods for FIFO, LIFO, Priority Queue, sorting, circular buffer, binary search, and whatever -- all directly in the array.

Wouldn't it be more practical to just have the basic operations of arrays in it, and then have other people implement the FIFO, etc. (Be it by inheritance or client classes or just procedural code that uses the array.)

A stream should do stream stuff only. (That's what I meant with the OSI model.)
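
The "skip by reading and discarding" approach described above can be sketched as follows in current D. `ReadBlock` and `skipForward` are hypothetical names for the illustration; the read primitive is assumed to follow the convention from the start of the thread, returning 0 at end of stream.

```d
import std.algorithm.comparison : min;

// Hypothetical read primitive: fills the buffer, returns bytes read, 0 at eof.
alias ReadBlock = size_t delegate(ubyte[] buffer);

// Advances a forward-only source by reading and discarding up to `count` bytes.
// Returns how many bytes were actually skipped.
ulong skipForward(ReadBlock read, ulong count)
{
    ubyte[256] scratch;
    ulong skipped = 0;
    while (skipped < count)
    {
        immutable want = cast(size_t) min(count - skipped, scratch.length);
        immutable got = read(scratch[0 .. want]);
        if (got == 0)
            break;    // source ended before `count` bytes were available
        skipped += got;
    }
    return skipped;
}
```

This works on any source, including pipes and sockets, but it can only move forward and its cost grows with the distance skipped.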
Apr 18 2005
parent "Ben Hinkle" <bhinkle mathworks.com> writes:
 Just think about it. Have you ever use any stream with positioning in 
 your practice? It is either pure stream (getNextChar) or sort of
  block-oriented access like fread/fwrite. But not their mix.
yes I've used positioning with MATLAB many times. Finding and reading chunks of a file without having to read from the start is handy.
Files (ex. on disk) can be opened for serial access, or random access. You can also read some, and then skip a number of bytes -- this can be done both with files opened for serial or random access, and also with streams. But positioning with a stream, that's not what a stream implementation should do. Neither trying to get the size.
I can see how the word "stream" could imply a bunch of data flowing by (or being generated) like water. But unfortunately that's the accepted term that is now used to include files and pipes and sockets etc. Since files (and memory streams) can be random-access and can have a size it is more practical to allow some form of "seekable streams" rather than create a different class and API just for random-access files.
 The application _using_ your stream is of course free to pretend it can 
 position. But that has to be done with opening your stream and just 
 skipping (i.e. reading and discarding values) till the app is happy with 
 the "position".
That depends on the stream. Also what you describe would only allow the position to be set to something further along the stream instead of anywhere in the stream.
 In general, streams should only do stuff that's "within the stream 
 concept". For all we know, the stream could become connected to the 
 keyboard, and then there is no way of knowing in advance how much or when 
 Georg is going to type before he gets fed-up. Right?
It is funny that after all these years of having computers around we humans still haven't figured out how to best deal with files and pipes and sockets. You'd think that API would have settled down in the 70's.
Apr 18 2005
prev sibling parent Ben Hinkle <Ben_member pathlink.com> writes:
In article <4262A961.9040102 nospam.org>, Georg Wrede says...
Size() implies seekability.

Someone using size() on non-seekable streams is making a programmer 
error, IMHO. My suggestion is a non-quenchable error.
Sounds reasonable - except I'll leave the non-quenchable part up to the application :-P. The default size() will throw a SeekException if the stream is not seekable. Subclasses can override size() if they want to do something else. That is more backwards-compatible anyway since the only non-SeekException-throwing streams were pipes on Windows.
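
The behaviour settled on above might look roughly like this. It is a sketch in current D under stated assumptions, not the std.stream source: `BasicStream`, `MemoryBackedStream`, `KnownLengthStream` and `computeSize` are hypothetical names, and `SeekException` is redefined locally so the example stands alone.

```d
// Hypothetical names throughout; this mirrors the behaviour described above,
// not the actual std.stream implementation.
class SeekException : Exception
{
    this(string msg) { super(msg); }
}

class BasicStream
{
    bool seekable = false;

    // Default: asking a non-seekable stream for its size is an error.
    ulong size()
    {
        if (!seekable)
            throw new SeekException("size() called on a non-seekable stream");
        return computeSize();
    }

    protected ulong computeSize() { return 0; }
}

class MemoryBackedStream : BasicStream
{
    private ubyte[] data;

    this(ubyte[] data)
    {
        this.data = data;
        seekable = true;
    }

    protected override ulong computeSize() { return data.length; }
}

// A non-seekable stream that still knows its length, e.g. from a header.
class KnownLengthStream : BasicStream
{
    private ulong declaredLength;

    this(ulong declaredLength) { this.declaredLength = declaredLength; }

    override ulong size() { return declaredLength; }
}
```

Throwing keeps a caller from mistaking a sentinel such as 0 or ulong.max for a real length, while the override leaves room for the "knows its length but cannot seek" streams raised elsewhere in the thread.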
Apr 17 2005
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 17 Apr 2005 12:23:32 -0400, Ben Hinkle <ben.hinkle gmail.com>  
wrote:
 What should size() of a non-seekable stream return or do?
If the stream knows (or can get) a correct size then it should return it.
If the stream does not know and/or cannot get a correct size it should throw an exception (or return a value meaning "indeterminate").

By "size" I am referring to the total number of bytes the stream will _ever_ contain, not "available", the number of bytes it _currently_ has available (the two may be equal in some cases).

I'm not sure the exception should be a SeekException as technically I don't think being able to get a size has anything to do with seekability.

Technically it's possible to have a stream which knows how long it is, but is not seekable. However, practically, I'm struggling to think of an example.
 Currently it
 depends on the stream type: for a general stream it throws a  
 SeekException
 and for a File on Windows it returns 0 (which is just what GetFileSize
 returns for non-seekable streams like pipes).
For things like pipes, unless they've closed (and you've buffered the data from them) you won't really know the "size".
 I'm tempted to have it return
 ulong.max. Any objections?
Which would mean the size is what? What if someone assumes the size is correct and allocates that many bytes to read into? I reckon you need to return a value which means "indeterminate". i.e. -1 or something.
 While I'm at it I'm making eof testing more efficient for both seekable  
 and
 non-seekable streams by using the convention that if readBlock returns 0
 then the stream is at eof (and I'd like to document that). Technically  
 that
 wasn't part of the existing readBlock's documentation but it's what  
 happens
 in practice and it comes in handy with non-seekable streams.
So "eof" will call readBlock? Or when readBlock returns 0 you'll set a flag which "eof" will check? Regan
Apr 17 2005
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Regan Heath wrote:
 On Sun, 17 Apr 2005 12:23:32 -0400, Ben Hinkle <ben.hinkle gmail.com>  
 wrote:
 
 What should size() of a non-seekable stream return or do?
If the stream knows (or can get) a correct size then it should return it.
Isn't the whole concept of stream (as opposed to file) precisely the idea that you should not rely on any "knowledge" of it -- beyond what you've already got! Think of the OSI model, and keep the stream implementation focused.
Apr 18 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Mon, 18 Apr 2005 12:59:20 +0300, Georg Wrede <georg.wrede nospam.org>  
wrote:
 Regan Heath wrote:
 On Sun, 17 Apr 2005 12:23:32 -0400, Ben Hinkle <ben.hinkle gmail.com>   
 wrote:

 What should size() of a non-seekable stream return or do?
If the stream knows (or can get) a correct size then it should return it.
Isn't the whole concept of stream (as opposed to file) precisely the idea that you should not rely on any "knowledge" of it -- beyond what you've already got!
In that case no stream should implement "size", but only "available". Where "size" means maximum number of bytes in this 'thing' and "available" means number of bytes in it _now_.
 Think of the OSI model, and keep the stream implementation focused.
OSI? Regan
Apr 18 2005
parent Georg Wrede <georg.wrede nospam.org> writes:
Regan Heath wrote:
 On Mon, 18 Apr 2005 12:59:20 +0300, Georg Wrede 
 <georg.wrede nospam.org>  wrote:
 
 Regan Heath wrote:

 On Sun, 17 Apr 2005 12:23:32 -0400, Ben Hinkle 
 <ben.hinkle gmail.com>   wrote:

 What should size() of a non-seekable stream return or do?
If the stream knows (or can get) a correct size then it should return it.
Isn't the whole concept of stream (as opposed to file) precisely the idea that you should not rely on any "knowledge" of it -- beyond what you've already got!
In that case no stream should implement "size", but only "available". Where "size" means maximum number of bytes in this 'thing' and "available" means number of bytes in it _now_.
 Think of the OSI model, and keep the stream implementation focused.
OSI? Regan
http://www.webopedia.com/quick_ref/OSI_Layers.asp
Apr 18 2005
prev sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
 Isn't the whole concept of stream (as opposed to file) precisely the idea 
 that you should not rely on any "knowledge" of it -- beyond what you've 
 already got!
I agree with Georg 100%. And one more thing. Stream is stream. Vector is vector. You can build stream on top of vector but not vice versa. This is why for the sake of logical simplicity/clearness they should not be combined in one entity. Andrew.
Apr 18 2005