digitalmars.D - VOTE: mango.io => std.io

reply Unknown lurker <Unknown_member pathlink.com> writes:
Maybe we should hold a vote to convince Walter to deprecate std.stream and
add mango.io to Phobos. Please reply only with +1 / -1 / 0.

+1
Jun 24 2004
next sibling parent "Zz" <Zz Zz.com> writes:
+1
"Unknown lurker" <Unknown_member pathlink.com> wrote in message
news:cbepgc$1bbh$1 digitaldaemon.com...
 Maybe we should have cast a vote to convince Walter to deprecate
 std.stream and add mango.io to phobos. Please reply only with +1 / -1 / 0.

 +1

Jun 24 2004
prev sibling next sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Unknown lurker wrote:

 Maybe we should have cast a vote to convince Walter to deprecate
 std.stream and add mango.io to phobos. Please reply only with +1 / -1 / 0.
 
 +1

I'd first like to see what Sean does with std.stream, plus I tend to agree with Matthew that more discussion is needed before jumping too soon in any one direction. There's a lot of cool stuff in mango that I'd love to see somehow merged with std.stream if possible. Maybe I'll give a poke at porting Kris's tokenizers and endian stuff to std.stream just to see what it looks like.

This will probably open up rat-holes, but two quick examples of things to discuss:

1) in mango it looks like to open a file and read it you need to create a FileConduit and pass that to a Reader constructor. So you have to grok the difference between Conduits and Readers/Writers (and maybe Buffers? I notice IConduit has a createBuffer method so is it not buffered by default? I'm not sure). In std.stream you make one object and there is less to grok. The flexibility of mango is probably nice but it adds complexity. Each person has a different notion of where to draw the boundaries.

2) in mango to use object serialization/deserialization you register an instance of a class, so that means at startup you basically have to instantiate one of every class that might want to be deserialized. Seems wasteful and it could affect class design to avoid having classes that have interdependencies.
Jun 24 2004
next sibling parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbf0lp$1lvi$1 digitaldaemon.com>, Ben Hinkle says...

I'd first like to see what Sean does with std.stream plus I tend to agree
with Matthew that more discussion is needed before jumping too soon in any
one direction. There's a lot of cool stuff in mango that I'd love to see
somehow merged with std.stream if possible. Maybe I'll give a poke at
porting Kris's tokenizers and endian stuff to std.stream just to see what
it looks like.

I'd be quite happy if std.stream were to be improved. Here are some suggestions. You'll probably think that many of them are trivial, but each, in its own way, contributes a small amount of annoyance, and I'm sure these things could easily be got rid of.

1) Since it is more normal to want buffered file access than non-buffered file access (in C, fopen() is called more often than open()), it makes sense that File should be buffered by default, and there should be a separate class, maybe called RawFileStream or something, for the unbuffered case.

2) File should in any case be renamed FileStream.

3) FileMode.In and FileMode.Out should be renamed FileMode.IN and FileMode.OUT respectively.

4) It should be possible to construct a File object in create mode, in one step. As in
   File f = new File(filename, FileMode.CREATE);

5) In fact, all possible combinations of file opening supported by fopen() should be supported by File. It should be possible to assert that the file does or does not exist before opening it (atomically), to truncate or not truncate, to position the file pointer at the start or end of the file, to allow append-only access, etc.

6) The destructor should always close the file.

7) EITHER Stream classes should be auto (likely to be an unpopular suggestion, I know), OR there should be an auto wrapper class that you can construct from a Stream, in order to guarantee that the file will be closed in the event of an exception (which could of course be thrown by ANY piece of code). Currently we have to either roll our own auto wrapper, or use a try/catch block.

8) Documentation should be complete and accurate.

9) There should be a FilterStream class, from which BufferedStream inherits, so that we can write our own stream filters. (Java does this. It's neat).

10) Streams don't necessarily have to do transcoding (see - I learnt a new word), but nonetheless it should be POSSIBLE to construct them from a Reader/Writer in order to make such extensions possible in the future.

11) I want the function available(), as Java has. A buffered stream always knows how much it's got left in its buffer, and I have no problem with an unbuffered stream returning zero.

12) stdin, stdout and stderr should be globally available D streams. (Maybe they are already, but point (8) means there's a lot I don't know about existing capabilities)

13) Streams should overload the << and >> operators. (Someone suggested using ~. That would be fine too).

None of these is particularly difficult in and of itself, but together they add up to a frustrating gripe list. I'm fairly confident that if these flaws are fixed (along with any other gripes which others may mention in the course of this thread), most people will be pretty happy with the new improved std.stream.
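Point 5 can be made concrete with a small sketch. Everything named here is hypothetical rather than part of std.stream: the flags `CREATE`, `TRUNCATE`, `APPEND` and `EXCLUSIVE` and the helper `toFopenMode` are invented for illustration, and the code assumes a current D2 compiler. It only maps flag combinations onto C `fopen()` mode strings (the `x` suffix comes from C11) and rejects combinations that `fopen()` cannot express.

```d
import std.exception : assertThrown, enforce;

/// Hypothetical open flags; illustrative names, not the std.stream API.
enum FileMode : uint
{
    IN        = 1 << 0, // open for reading
    OUT       = 1 << 1, // open for writing
    CREATE    = 1 << 2, // create the file if it does not exist
    TRUNCATE  = 1 << 3, // discard any existing contents
    APPEND    = 1 << 4, // every write goes to the end of the file
    EXCLUSIVE = 1 << 5, // fail if the file already exists (C11 "x")
}

/// Translates a flag combination into a C fopen() mode string.
/// Throws if fopen() has no mode for the combination.
string toFopenMode(uint flags)
{
    bool has(FileMode f) { return (flags & f) != 0; }

    immutable r = has(FileMode.IN);
    immutable w = has(FileMode.OUT);
    enforce(r || w, "need IN, OUT or both");

    if (!w)
    {
        enforce(flags == FileMode.IN, "write-side flag given without OUT");
        return "rb";
    }
    if (has(FileMode.APPEND))
    {
        enforce(has(FileMode.CREATE) && !has(FileMode.TRUNCATE)
                && !has(FileMode.EXCLUSIVE), "fopen() append mode always creates");
        return r ? "a+b" : "ab";
    }
    if (has(FileMode.CREATE) && (has(FileMode.TRUNCATE) || has(FileMode.EXCLUSIVE)))
    {
        string mode = r ? "w+b" : "wb";
        return has(FileMode.EXCLUSIVE) ? mode ~ "x" : mode;
    }
    // What remains must be: existing file, contents kept, pointer at the start.
    enforce(flags == (FileMode.IN | FileMode.OUT),
            "combination not expressible with fopen()");
    return "r+b";
}

unittest
{
    assert(toFopenMode(FileMode.IN) == "rb");
    assert(toFopenMode(FileMode.OUT | FileMode.CREATE | FileMode.TRUNCATE) == "wb");
    assert(toFopenMode(FileMode.IN | FileMode.OUT) == "r+b");
    assert(toFopenMode(FileMode.OUT | FileMode.CREATE | FileMode.APPEND) == "ab");
    assert(toFopenMode(FileMode.OUT | FileMode.CREATE | FileMode.EXCLUSIVE) == "wbx");
    // Open-or-create without truncation has no fopen() mode string.
    assertThrown(toFopenMode(FileMode.OUT | FileMode.CREATE));
}
```

The last assertion shows why a flag-based interface can end up wider than `fopen()`: "create if missing, keep the contents, start at the beginning" is a reasonable request that no `fopen()` mode string covers.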
This will probably open up rat-holes, but two quick examples of things to
discuss:

1) in mango it looks like to open a file and read it you need to create a
FileConduit and pass that to a Reader constructor. So you have to grok the
difference between Conduits and Readers/Writers (and maybe Buffers? I
notice IConduit has a createBuffer method so is it not buffered by default?
I'm not sure). In std.stream you make one object and there is less to grok.
The flexibility of mango is probably nice but it adds complexity. Each
person has a different notion of where to draw the boundaries.

But there is logic behind it. Currently, D does no transcoding - that is, writeLine() will spit out raw UTF-8. Now that's fine if your output is going to a text file, but if it's going to a console, you're screwed. Now you COULD simplify this a bit by "automatically" encoding the output in the operating system default encoding - but that would just reverse the problem. Now, output to the console would be fine, but output destined to leave your machine and end up on someone else's machine (e.g. text file, socket, etc.) would also be similarly munged. UTF-8 is pretty much the best portable format, so ideally you only want to encode at the last minute, just before the stream hits the user.
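A minimal sketch of the problem described above, assuming a current D2 compiler and a console that expects ISO-8859-1. The helpers `toLatin1` and `shownOnLatin1Console` are hypothetical names invented for this illustration. The first encodes at the last moment, the second models what such a console displays when handed bytes.

```d
import std.string : representation;

/// Encodes text as ISO-8859-1 at the output boundary; characters outside
/// Latin-1 are replaced with '?'. (Hypothetical helper for illustration.)
ubyte[] toLatin1(string text)
{
    ubyte[] bytes;
    foreach (dchar c; text) // decodes the UTF-8 string one code point at a time
        bytes ~= c <= 0xFF ? cast(ubyte) c : cast(ubyte) '?';
    return bytes;
}

/// Models a Latin-1 console: every incoming byte is shown as one character.
dstring shownOnLatin1Console(const(ubyte)[] bytes)
{
    dchar[] shown;
    foreach (b; bytes)
        shown ~= cast(dchar) b;
    return shown.idup;
}

unittest
{
    string text = "caf\u00E9"; // stored as UTF-8: the last character takes two bytes

    // Raw UTF-8 sent to a Latin-1 console comes out garbled.
    assert(shownOnLatin1Console(text.representation) == "caf\u00C3\u00A9"d);

    // Encoding at the last moment gives the console what it expects.
    assert(shownOnLatin1Console(toLatin1(text)) == "caf\u00E9"d);

    // The text itself stays UTF-8 for files, sockets and other machines.
    ubyte[] utf8 = [0x63, 0x61, 0x66, 0xC3, 0xA9];
    assert(text.representation == utf8);
}
```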
2) in mango to use object serialization/deserialization you register an
instance of a class so that means at startup you basically have to
instantiate one of every class that might want to be deserialized. Seems
wasteful and it could affect class design to avoid having classes that have
interdependencies.

I'm not convinced that serialization necessarily has anything to do with streams. You could serialize to a string, or an in-memory buffer. I guess that would be faster for small objects but disadvantageous for very large ones. In any case, you don't need to decide on a firm serialization policy in order to make streams feel nice. That can come later, once we're happy with the basics. Arcane Jill
Jun 24 2004
next sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <cbf8jg$221b$1 digitaldaemon.com>, Arcane Jill says...
I'd be quite happy if std.stream were to be improved. Here are some suggestions.
You'll probably think that many of them are trivial, but each, in their own way,
contributes a small amount of annoyance, and I'm sure these things could be
easily got rid of.

1) Since it is more normal to want buffered file access than non-buffered file
access (in C, fopen() is called more often than open()), it makes sense that
File should be buffered by default, and there should be a separate class, maybe
called RawFileStream or something, for the unbuffered case.

I was actually going to take a different approach and modify BufferedStream like so:

    class BufferedStream( BaseStream ) : BaseStream
    {
        // override the low-level i/o methods to do buffering
    }

So a buffered file stream would be:

    alias BufferedStream!(FileStream) BufferedFileStream;

But as you say, file i/o is almost always buffered, so it may make sense to change the name of "FileStream" to "UnbufferedFileStream" and thus make buffered file i/o the default.
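One way the template idea might look in a current D2 compiler is sketched below. This is an illustration under stated assumptions, not the planned std.stream code: `MemoryStream` is a hypothetical stand-in for an unbuffered FileStream, `readBlock()` is assumed to be the only low-level read method, and only reading is buffered. The `available()` method from the suggestion list falls out of the buffer bookkeeping.

```d
import std.algorithm.comparison : min;

/// Stand-in for an unbuffered stream such as FileStream (hypothetical).
class MemoryStream
{
    private const(ubyte)[] data;
    size_t lowLevelReads; // counts calls, to make the effect of buffering visible

    this(const(ubyte)[] data) { this.data = data; }

    /// Copies up to buf.length bytes into buf and returns the number copied.
    size_t readBlock(ubyte[] buf)
    {
        ++lowLevelReads;
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

/// Adds read buffering to any base class that has readBlock().
class BufferedStream(BaseStream) : BaseStream
{
    private ubyte[] buffer;
    private size_t pos, len; // unread bytes are buffer[pos .. len]

    /// Forwards the remaining arguments to the base class constructor.
    this(Args...)(size_t bufferSize, Args args)
    {
        super(args);
        buffer = new ubyte[bufferSize];
    }

    override size_t readBlock(ubyte[] buf)
    {
        size_t done = 0;
        while (done < buf.length)
        {
            if (pos == len) // buffer exhausted: refill it with one low-level read
            {
                len = super.readBlock(buffer);
                pos = 0;
                if (len == 0) break; // end of data
            }
            immutable n = min(buf.length - done, len - pos);
            buf[done .. done + n] = buffer[pos .. pos + n];
            pos += n;
            done += n;
        }
        return done;
    }

    /// Bytes that can be read without touching the underlying stream.
    size_t available() const { return len - pos; }
}

alias BufferedMemoryStream = BufferedStream!MemoryStream;

unittest
{
    ubyte[] data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    auto stream = new BufferedMemoryStream(8, data);

    ubyte[1] one;
    foreach (expected; 0 .. 3)
    {
        assert(stream.readBlock(one[]) == 1);
        assert(one[0] == expected);
    }
    assert(stream.lowLevelReads == 1); // three reads, one trip to the base stream
    assert(stream.available() == 5);

    auto rest = new ubyte[10];
    assert(stream.readBlock(rest) == 7);
    assert(rest[0 .. 7] == data[3 .. $]);
}
```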
2) File should in any case be renamed FileStream

3) FileMode.In and FileMode.Out should be renamed Filemode.IN and Filemode.OUT
respectively.

Both already done :) Well, it was going to be Stream.IN and Stream.OUT, but same thing.
5) In fact, all possible combinations of file opening supported by fopen()
should be supported by File. It should be possible to assert that the file does
or does not exist before opening it (atomically), to truncate or not truncate,
to position the file pointer at the start or end of the file, to allow
append-only access, etc.

Right now the file stuff uses CreateFile in Windows. Would it be better to use fopen and the other ANSI calls instead?
7) EITHER Stream classes should be auto (likely to be an unpopular suggestion, I
know), OR there should be an auto wrapper class that you can construct from a
Stream, in order to guarantee that the file will be closed in the event of an
exception (which could of course be thrown by ANY piece of code). Currently we
have to either roll our own auto wrapper, or use a try/catch block.

Interesting idea. This may be another good template wrapper:

    auto class MakeAuto( BaseClass ) : BaseClass {}
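The `auto class` attribute used above belongs to the D of this thread. As a hedged sketch of the same guarantee in a current D2 compiler, the code below uses a struct destructor and the `scope (exit)` statement instead. `Resource`, `AutoClose` and `work` are hypothetical names, and the wrapper assumes the wrapped type is a class with a `close()` method.

```d
/// Hypothetical stream-like class with an explicit close().
class Resource
{
    bool closed;
    void close() { closed = true; }
}

/// Closes the wrapped class object when the wrapper goes out of scope.
struct AutoClose(T)
{
    T target;

    @disable this(this); // no copies, so close() runs exactly once

    ~this()
    {
        if (target !is null)
            target.close();
    }

    alias target this; // lets the wrapper be used like the wrapped object
}

void work(Resource r)
{
    auto guard = AutoClose!Resource(r);
    assert(!guard.closed); // member access is forwarded to the Resource
    throw new Exception("something went wrong mid-stream");
}

unittest
{
    auto r = new Resource;
    try
    {
        work(r);
    }
    catch (Exception e)
    {
        // The exception still propagates; the point is what happened to r.
    }
    assert(r.closed);

    // The statement form needs no wrapper type at all.
    auto r2 = new Resource;
    {
        scope (exit) r2.close();
        assert(!r2.closed);
    }
    assert(r2.closed);
}
```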
9) There should be a FilterStream class, from which BufferedStream inherits, so
that we can write our own stream filters. (Java does this. It's neat).

10) Streams don't necessarily have to do transcoding (see - I learnt a new
word), but nonetheless it should be POSSIBLE to construct them from a
Reader/Writer in order to make such extensions possible in the future.

This kind of stuff should be saved for a later discussion. All great ideas but they're the tip of a rather large iceberg.
11) I want the function available(), as Java has. A buffered stream always knows
how much it's got left in its buffer, and I have no problem with an unbuffered
stream returning zero.

Easy enough. I was going to add this to the BufferedStream class, though perhaps it would be useful everywhere?
12) stdin, stdout and stderr should be globally available D streams. (Maybe they
are already, but point (8) means there's a lot I don't know about existing
capabilities)

I think they are. I kind of consider Phobos to still be in the state where looking at the source files is the best way to find out what's available. Doxygen or other documentation is crucial, but I have a feeling that the stream API will be in flux for quite some time yet.
13) Streams should overload the << and >> operators. (Someone suggested using ~.
That would be fine too).

Overloading ~ wouldn't work I'm afraid, unless there's something I'm missing. Say I have a FileStream and I want to both read and write from it. How do I know which I want to do if I'm using the same operator for both?
None of these is particularly difficult in and of itself, but together they add
up to a frustrating gripe list. But I'm fairly confident that if these flaws are
fixed (along with any other gripes which others may mention in the course of
this thread) then I imagine that most people will be pretty happy with new
improved std.stream.

I agree. But as D is still pretty early in its development I don't really want people to be happy with anything if it means losing constructive dialog. I'm willing to sacrifice productivity in the short term if it means a better library in the long term.
But there is logic behind it. Currently, D does no transcoding - that is,
writeLine() will spit out raw UTF-8. Now that's fine if your output is going to
a text file, but if it's going to a console, you're screwed. Now you COULD
simplify this a bit by "automatically" encoding the output in the operating
system default encoding - but that would just reverse the problem. Now, output
to the console would be fine, but output destined to leave your machine and end
up on someone else's machine (e.g. text file, socket, etc.) would also be
similarly munged. UTF-8 is pretty much the best portable format, so ideally you
only want to encode at the last minute, just before the stream hits the user.

This is the start of a rather long discussion. I've had enough issues come up with formatted i/o that I'm going to leave that aspect rather bare and see what develops. For the moment, I'm starting to think that handling plain ol' ASCII is probably enough until the rest can be worked out. Sean
Jun 24 2004
next sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Thu, 24 Jun 2004 20:32:52 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:
 In article <cbf8jg$221b$1 digitaldaemon.com>, Arcane Jill says...
 I'd be quite happy if std.stream were to be improved. Here are some 
 suggestions.
 You'll probably think that many of them are trivial, but each, in their 
 own way,
 contributes a small amount of annoyance, and I'm sure these things 
 could be
 easily got rid of.

 1) Since it is more normal to want buffered file access than 
 non-buffered file
 access (in C, fopen() is called more often than open()), it makes sense 
 that
 File should be buffered by default, and there should be a separate 
 class, maybe
 called RawFileStream or something, for the unbuffered case.

I was actually going to take a different approach and modify BufferedStream like so:

    class BufferedStream( BaseStream ) : BaseStream
    {
        // override the low-level i/o methods to do buffering
    }

So a buffered file stream would be:

    alias BufferedStream!(FileStream) BufferedFileStream;

I agree. This seems most logical to me.
 But as you say, file i/o is almost always buffered, so it may make sense 
 to
 change the name of "FileStream" to "UnbufferedFileStream" and thus make 
 buffered
 file i/o the default.

Perhaps.. Files may be an exception to the rule, but, if you can handle that exception as you have shown above, at no cost, then why not.
 2) File should in any case be renamed FileStream

 3) FileMode.In and FileMode.Out should be renamed Filemode.IN and 
 Filemode.OUT
 respectively.

Both already done :) Well, it was going to be Stream.IN and Stream.OUT, but same thing.

Excellent. Generic names are good, i.e. you could have a template that took a stream type (File, Socket etc.) and pass the same IN, OUT etc. constants.
 5) In fact, all possible combinations of file opening supported by 
 fopen()
 should be supported by File. It should be possible to assert that the 
 file does
 or does not exist before opening it (atomically), to truncate or not 
 truncate,
 to position the file pointer at the start or end of the file, to allow
 append-only access, etc.

Right now the file stuff uses CreateFile in Windows. Would it be better to use fopen and the other ANSI calls instead?

No.. well it depends if you want to do the buffering yourself i.e. using D arrays etc or make use of the existing fopen buffering. If CreateFile on windows and open on unix with your own buffering is more efficient then go that way.
 7) EITHER Stream classes should be auto (likely to be an unpopular 
 suggestion, I
 know), OR there should be an auto wrapper class that you can construct 
 from a
 Stream, in order to guarantee that the file will be closed in the event 
 of an
 exception (which could of course be thrown by ANY piece of code). 
 Currently we
 have to either roll our own auto wrapper, or use a try/catch block.

Interesting idea. This may be another good template wrapper: auto class MakeAuto( BaseClass ) : BaseClass {}
 9) There should be a FilterStream class, from which BufferedStream 
 inherits, so
 that we can write our own stream filters. (Java does this. It's neat).

 10) Streams don't necessarily have to do transcoding (see - I learnt a 
 new
 word), but nonetheless it should be POSSIBLE to construct them from a
 Reader/Writer in order to make such extensions possible in the future.

This kind of stuff should be saved for a later discussion. All great ideas but they're the tip of a rather large iceberg.
 11) I want the function available(), as Java has. A buffered stream 
 always knows
 how much it's got left in its buffer, and I have no problem with an 
 unbuffered
 stream returning zero.

Easy enough. I was going to add this to the BufferedStream class, though perhaps it would be useful everywhere?
 12) stdin, stdout and stderr should be globally available D streams. 
 (Maybe they
 are already, but point (8) means there's a lot I don't know about 
 existing
 capabilities)

I think they are. I kind of consider Phobos to still be in the state where looking at the source files is the best way to find out what's available. Doxygen or other documentation is crucial, but I have a feeling that the stream API will be in flux for quite some time yet.
 13) Streams should overload the << and >> operators. (Someone suggested 
 using ~.
 That would be fine too).

Overloading ~ wouldn't work I'm afraid, unless there's something I'm missing. Say I have a FileStream and I want to both read and write from it. How do I know which I want to do if I'm using the same operator for both?
 None of these is particularly difficult in and of itself, but together 
 they add
 up to a frustrating gripe list. But I'm fairly confident that if these 
 flaws are
 fixed (along with any other gripes which others may mention in the 
 course of
 this thread) then I imagine that most people will be pretty happy with 
 new
 improved std.stream.

I agree. But as D is still pretty early in its development I don't really want people to be happy with anything if it means losing constructive dialog. I'm willing to sacrifice productivity in the short term if it means a better library in the long term.
 But there is logic behind it. Currently, D does no transcoding - that 
 is,
 writeLine() will spit out raw UTF-8. Now that's fine if your output is 
 going to
 a text file, but if it's going to a console, you're screwed. Now you 
 COULD
 simplify this a bit by "automatically" encoding the output in the 
 operating
 system default encoding - but that would just reverse the problem. Now, 
 output
 to the console would be fine, but output destined to leave your machine 
 and end
 up on someone else's machine (e.g. text file, socket, etc.) would also 
 be
 similarly munged. UTF-8 is pretty much the best portable format, so 
 ideally you
 only want to encode at the last minute, just before the stream hits the 
 user.

This is the start of a rather long discussion. I've had enough issues come up with formatted i/o that I'm going to leave that aspect rather bare and see what develops. For the moment, I'm starting to think that handling plain ol' ASCII is probably enough until the rest can be worked out.

What do you think of my filters idea? As long as you can snap any number of filters onto streams and onto each other, your data will be transcoded etc. from one end to the other, and back again in the other direction.

Regan.
-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 24 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr94etton5a2sq9 digitalmars.com>, Regan Heath says...

What do you think of my filters idea, as long as you can snap any number 
of filters to streams and each other your data will be transcoded etc from 
one end to the other, and back again in the other direction.

Regan, your filters are /almost/ the same idea as mango's Readers/Writers. We're pretty much talking the same thing here, only by a different name.

But there is nonetheless a very important difference between the two concepts, which you may have missed. This is that a character sequence is a sequence of 32-bit-wide dchars, whereas a traditional stream is a sequence of 8-bit-wide bytes. So, at some stage, you need a "filter" which converts from ubyte[] to dchar[]. Such filters do not chain, because the output from one will not be the same type as the input to the next. Now, you COULD insist that everything be done on an 8-bit stream (mandating UTF-8 as the format for actual characters), but there is an efficiency issue there. UTF-32 is always going to be faster to process than UTF-8.

Besides which - you don't NEED a chain of filters when transcoding in D, because one end WILL be Unicode, always.

So I'd say the ideal situation would be:

(1) Reader classes which convert ubytes from a stream (of known encoding) into dchars (Unicode). You'd need one Reader for each encoding standard.

(2) Writer classes which convert dchars (Unicode) into ubytes (of some known encoding) to be sent to a stream (again, one for each encoding standard).

(3) Filters, as described by you, which convert ubytes into more ubytes, and can do completely arbitrary things.

But I don't think your 8-bit-wide filters should be trying to handle dchars. That's a different job. I think the above would give you maximum flexibility, however, without losing any efficiency. What do you think?

Arcane Jill
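The Reader/Writer split in (1) and (2) might be sketched as follows, assuming a current D2 compiler. The interfaces and class names are hypothetical (they are not mango's or Phobos's API), and for brevity they work on whole arrays rather than on streams.

```d
import std.utf : toUTF8;

/// (1) A Reader turns bytes in one known encoding into Unicode dchars.
interface Reader
{
    dchar[] decode(const(ubyte)[] bytes);
}

/// (2) A Writer turns Unicode dchars into bytes in one known encoding.
interface Writer
{
    ubyte[] encode(const(dchar)[] text);
}

class Latin1Reader : Reader
{
    dchar[] decode(const(ubyte)[] bytes)
    {
        auto text = new dchar[bytes.length];
        foreach (i, b; bytes)
            text[i] = cast(dchar) b; // Latin-1 bytes equal their code points
        return text;
    }
}

class Utf8Reader : Reader
{
    dchar[] decode(const(ubyte)[] bytes)
    {
        dchar[] text;
        foreach (dchar c; cast(const(char)[]) bytes) // decodes UTF-8 sequences
            text ~= c;
        return text;
    }
}

class Utf8Writer : Writer
{
    ubyte[] encode(const(dchar)[] text)
    {
        return cast(ubyte[]) toUTF8(text).dup;
    }
}

/// Transcoding needs one Reader and one Writer, never a longer chain,
/// because the representation in the middle is always Unicode.
ubyte[] transcode(Reader from, Writer to, const(ubyte)[] bytes)
{
    return to.encode(from.decode(bytes));
}

unittest
{
    ubyte[] latin1 = [0x63, 0x61, 0x66, 0xE9];       // "caf\u00E9" in ISO-8859-1
    ubyte[] utf8   = [0x63, 0x61, 0x66, 0xC3, 0xA9]; // "caf\u00E9" in UTF-8

    assert((new Latin1Reader).decode(latin1) == "caf\u00E9"d);
    assert((new Utf8Reader).decode(utf8) == "caf\u00E9"d);
    assert(transcode(new Latin1Reader, new Utf8Writer, latin1) == utf8);
}
```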
Jun 25 2004
next sibling parent Sam McCall <tunah.d tunah.net> writes:
Arcane Jill wrote:
 (1) Reader classes which convert ubytes from a stream (of known encoding) into
 dchars (Unicode). You'd need one Reader for each encoding standard.
 
 (2) Writer classes which convert dchars (Unicode) into ubytes (of some known
 encoding) to be sent to a stream (again, one for each encoding standard)
 
 (3) Filters, as described by you, which convert ubytes into more ubytes, and can
 do completely arbitrary things.
 

necessary for completeness, if nothing else ;-) But some things like a pushback filter that supports "unreading" might be applicable at the byte level in some situations and the character level in others.

Sam
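The pushback case can be sketched as a template over the element type, so that one definition serves both levels. This is a hypothetical illustration for a current D2 compiler: `Pushback` is an invented name, and an in-memory array stands in for the underlying stream.

```d
/// Pushback layer over a sequence of T; T may be ubyte (byte level)
/// or dchar (character level). The array stands in for a real stream.
struct Pushback(T)
{
    private const(T)[] source;
    private T[] pushed; // unread elements; the most recent one is last

    this(const(T)[] source) { this.source = source; }

    /// Stores the next element in item; returns false at end of input.
    bool read(out T item)
    {
        if (pushed.length > 0)
        {
            item = pushed[$ - 1];
            pushed = pushed[0 .. $ - 1];
            return true;
        }
        if (source.length == 0)
            return false;
        item = source[0];
        source = source[1 .. $];
        return true;
    }

    /// Makes item the next element returned by read().
    void unread(T item) { pushed ~= item; }
}

unittest
{
    // Character level: look at a character, then put it back.
    auto chars = Pushback!dchar("ab"d);
    dchar c;
    assert(chars.read(c) && c == 'a');
    chars.unread(c);
    assert(chars.read(c) && c == 'a');
    assert(chars.read(c) && c == 'b');
    assert(!chars.read(c));

    // Byte level: the same template, instantiated for ubyte.
    ubyte[] raw = [1, 2];
    auto bytes = Pushback!ubyte(raw);
    ubyte b;
    assert(bytes.read(b) && b == 1);
    bytes.unread(b);
    assert(bytes.read(b) && b == 1);
    assert(bytes.read(b) && b == 2);
    assert(!bytes.read(b));
}
```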
Jun 25 2004
prev sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 07:23:28 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <opr94etton5a2sq9 digitalmars.com>, Regan Heath says...

 What do you think of my filters idea, as long as you can snap any number
 of filters to streams and each other your data will be transcoded etc 
 from
 one end to the other, and back again in the other direction.

Regan, your filters are /almost/ the same idea as mango's Readers/Writers. We're pretty much talking the same thing here, only by a different name.

I suspected as much.
 But there is nonetheless a very important difference between the two 
 concepts,
 which you may have missed. This is that a character sequence is a 
 sequence of
 32-bit-wide dchars, wheras a traditional stream is a sequence of 
 8-bit-wide
 bytes.

Yep. ok.
 So, at some stage, you need a "filter" which converts from ubyte[] to
 dchar[].

Yep..
 Such filters do not chain, because the output from one will not be the
 same type as the input to the next.

Who said all filters have to have the same input/output types? You could have a ubyte[] to dchar[] filter, and another filter foo which went from dchar[] to dchar[] doing something to it, like uppercasing it all or whatever.
 Now, you COULD insist that everything be
 done on an 8-bit stream (mandating UTF-8 as the format for actual 
 characters),
 but there is an efficiency issue there. UTF-32 is always going to be 
 faster to
 process than UTF-8.

I am not simply thinking of characters here, I am thinking in terms of an 8-bit stream of data, which may represent characters, but may represent something else entirely, e.g. a bzip filter for a raw data stream: you plug it into your FileStream and hey presto you have bzipped or bunzipped a file.
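A sketch of byte filters that snap together, assuming a current D2 compiler. To stay self-contained it swaps in a toy run-length coder and an XOR mask where the post mentions bzip. The `ByteFilter` alias and all function names are hypothetical, and the filters work on whole arrays rather than on a FileStream.

```d
/// A byte filter maps one byte sequence to another (hypothetical alias).
alias ByteFilter = ubyte[] function(const(ubyte)[]);

/// Toy stand-in for a compressor: emits (run length, value) pairs.
ubyte[] rleEncode(const(ubyte)[] input)
{
    ubyte[] output;
    size_t i = 0;
    while (i < input.length)
    {
        size_t run = 1;
        while (i + run < input.length && input[i + run] == input[i] && run < 255)
            ++run;
        output ~= cast(ubyte) run;
        output ~= input[i];
        i += run;
    }
    return output;
}

/// Inverse of rleEncode.
ubyte[] rleDecode(const(ubyte)[] input)
{
    ubyte[] output;
    for (size_t i = 0; i + 1 < input.length; i += 2)
        foreach (k; 0 .. input[i])
            output ~= input[i + 1];
    return output;
}

/// A second, unrelated filter; applying it twice restores the input.
ubyte[] xorMask(const(ubyte)[] input)
{
    auto output = input.dup;
    foreach (ref b; output)
        b ^= 0x5A;
    return output;
}

/// Runs data through the filters in order.
ubyte[] pipeline(const(ubyte)[] data, ByteFilter[] filters)
{
    ubyte[] current = data.dup;
    foreach (f; filters)
        current = f(current);
    return current;
}

unittest
{
    ubyte[] data = [7, 7, 7, 7, 1, 2, 2];
    ubyte[] packed = [4, 7, 1, 1, 2, 2];
    assert(rleEncode(data) == packed);

    // One direction, then back again through the inverse filters in reverse order.
    auto stored = pipeline(data, [&rleEncode, &xorMask]);
    assert(pipeline(stored, [&xorMask, &rleDecode]) == data);
}
```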
 Besides which - you don't NEED a chain of filters when transcoding in D, 
 because
 one end WILL be Unicode, always.

So.. if I'm writing the binary representation of a structure out to disk, which end is Unicode?
 So I'd say the ideal situation would be:

 (1) Reader classes which convert ubytes from a stream (of known 
 encoding) into
 dchars (Unicode). You'd need one Reader for each encoding standard.

Only if you're reading text.. surely?
 (2) Writer classes which convert dchars (Unicode) into ubytes (of some 
 known
 encoding) to be sent to a stream (again, one for each encoding standard)

sure, if you're writing text.
 (3) Filters, as described by you, which convert ubytes into more ubytes, 
 and can
 do completely arbitrary things.

Yep. Perfect.
 But I don't think your 8-bit-wide filters should be trying to handle 
 dchars.
 That's a different job. I think the above would give you maximum 
 flexibility,
 however, without losing any efficiency. What do you think?

I agree with pretty much all you say. I think we're just thinking about it with different priorities in mind, perhaps due to your recent excursion into unicode? :oÞ Regan.
Jun 25 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr95c05pg5a2sq9 digitalmars.com>, Regan Heath says...
 What do you think of my filters idea, as long as you can snap any number
 of filters to streams and each other your data will be transcoded etc 
 from
 one end to the other, and back again in the other direction.



The concept of an arbitrary byte-sequence to byte-sequence filter is staggeringly useful, no questions. Such a filter could zip, unzip, transcode from one eight-bit standard to another, and so on, just as you say.

But such a filter *IS ITSELF A STREAM*. In Java, you would make one by inheriting from FilterStream. Presumably, we'll be able to do that in D, too, fairly soon. But even without that, it suffices to inherit from std.stream.Stream, and simply override either readBlock() or writeBlock() (or both) as appropriate, and bingo - one arbitrary byte-sequence filter. In response to such a stream's readBlock() function being called, you get bytes as required from the underlying stream by calling /its/ readBlock() function, perform whatever filtering/buffering is required, and then return as many bytes as were requested (or fewer), buffering the rest.

Similarly, a Writer could derive from Stream, as it is a special case of an OutputStream, in that it delivers a stream of bytes. It is not a FilterStream, however, as its source is not a Stream.

It kind of stops there though. A Reader is not an InputStream, because it doesn't return a sequence of bytes. (It returns a sequence of dchars). However, it could use an InputStream as its SOURCE, since it needs a sequence of bytes as its input.

dchar-sequence to dchar-sequence filters are another kettle of fish altogether.

It would seem that the most general solution would be that Stream should be an alias for a template class TStream!(ubyte). Such a template could provide interfaces TStream!(ubyte).InputStream and TStream!(ubyte).OutputStream.

In this context, a Stream would implement TStream!(ubyte).InputStream and TStream!(ubyte).OutputStream; a Reader would implement TStream!(ubyte).InputStream and TStream!(dchar).OutputStream; a Writer would implement TStream!(dchar).InputStream and TStream!(ubyte).OutputStream; and a dchar-to-dchar filter would implement TStream!(dchar).InputStream and TStream!(dchar).OutputStream.

Now THAT's how I'd LIKE to see it done.

Arcane Jill
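A reduced sketch of the element-type-parameterised idea, assuming a current D2 compiler. It is not the TStream design itself: the interface `Source(T)` and the classes below are hypothetical, only the pull (read) side is modelled, and Latin-1 is used for the Reader so that one byte always yields one dchar.

```d
import std.uni : toUpper;

/// Hypothetical pull-style stream of elements of type T.
interface Source(T)
{
    /// Stores up to buf.length elements in buf; returns the count (0 at end).
    size_t read(T[] buf);
}

/// Source backed by an array (stand-in for a file or socket).
class ArraySource(T) : Source!T
{
    private const(T)[] data;
    this(const(T)[] data) { this.data = data; }

    size_t read(T[] buf)
    {
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

/// "Reader": consumes ubytes (Latin-1) and is itself a source of dchars.
class Latin1Reader : Source!dchar
{
    private Source!ubyte input;
    this(Source!ubyte input) { this.input = input; }

    size_t read(dchar[] buf)
    {
        auto raw = new ubyte[buf.length];
        immutable n = input.read(raw);
        foreach (i; 0 .. n)
            buf[i] = cast(dchar) raw[i];
        return n;
    }
}

/// dchar-to-dchar filter: upper-cases whatever passes through.
class UpperCaseFilter : Source!dchar
{
    private Source!dchar input;
    this(Source!dchar input) { this.input = input; }

    size_t read(dchar[] buf)
    {
        immutable n = input.read(buf);
        foreach (ref c; buf[0 .. n])
            c = toUpper(c);
        return n;
    }
}

// A byte source is not a character source, so it cannot be plugged
// straight into a dchar filter; the compiler enforces the distinction.
static assert(!is(ArraySource!ubyte : Source!dchar));

unittest
{
    ubyte[] bytes = [0x63, 0x61, 0x66, 0xE9]; // "caf\u00E9" in ISO-8859-1
    Source!dchar chain =
        new UpperCaseFilter(new Latin1Reader(new ArraySource!ubyte(bytes)));

    auto buf = new dchar[8];
    immutable n = chain.read(buf);
    assert(buf[0 .. n] == "CAF\u00C9"d);
}
```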
Jun 25 2004
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <cbheg0$280c$1 digitaldaemon.com>, Arcane Jill says...
It would seem that the most general solution would be that Stream should be an
alias for a template class TStream!(ubyte). Such a template could provide
interfaces TStream!(ubyte).InputStream and TStream!(ubyte).OutputStream.

You beat me to the punch--I was going to bring this up once we got back into talk about formatting :) I had been trying to avoid templates for char type, but it caused too many problems with formatted i/o. I'm not going to do anything with this for now, but I do think that this general idea is a good one. Sean
Jun 25 2004
prev sibling next sibling parent Daniel Horn <hellcatv hotmail.com> writes:
Go-Go Templates

Jun 25 2004
prev sibling next sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 14:57:04 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:

 In article <opr95c05pg5a2sq9 digitalmars.com>, Regan Heath says...
 What do you think of my filters idea, as long as you can snap any 
 number
 of filters to streams and each other your data will be transcoded etc
 from
 one end to the other, and back again in the other direction.



The concept of an arbitrary byte-sequence to byte-sequence filter is staggeringly useful, no questions. Such a filter could zip, unzip, transcode from one eight-bit standard to another, and so on, just as you say. But such a filter *IS ITSELF A STREAM*. In Java, you would make one by inheriting from FilterStream. Presumably, we'll be able to do that in D, too, fairly soon. But even without that, it suffices to inherit from std.stream.Stream, and simply override either readBlock() or writeBlock() (or both) as appropriate, and bingo - one arbitrary byte-sequence filter. In response to such a stream's readBlock() function being called, you get bytes as required from the underlying stream by calling /its/ readBlock() function, perform whatever filtering/buffering was required, and then return as many bytes as were requested (or fewer), buffering the rest.

Sure. I was just calling them filters because they do something to the data. Once we agree on terminology I think we'll find we all agree on how it should be done.
 Similarly, a Writer could derive from Stream, as it is a special case of 
 an
 OutputStream, in that it delivers a stream of bytes. It is not a 
 FilterStream,
 however, as its source is not a Stream.

 It kind of stops there though. A Reader is not an InputStream, because it
 doesn't return a sequence of bytes. (It returns a sequence of dchars). 
 However,
 it could use an InputStream as its SOURCE, since it needs a sequence of 
 bytes as
 its input.

I don't think I understand what a Writer and a Reader are exactly.

AFAICS you have a Source, e.g. a file, and a Sink, e.g. a socket. The source and sink are not streams themselves, but you could and probably would wrap a stream interface around them. In fact, designing the stream as a template would allow you to wrap it around anything that provided the requisite methods, i.e. read(), write() etc.

Filters are, as you say, streams; doing this allows you to plug them into any other stream.

Where does the Reader/Writer enter the picture? Are they simply filters (and thus streams) that convert from one data format to another? Or...
 dchar-sequence to dchar-sequence are another kettle of fish altogether.

By the same rationale as ubyte-sequence to ubyte-sequence aren't they a STREAM also?
 It would seem that the most general solution would be that Stream should 
 be an
 alias for a template class TStream!(ubyte). Such a template could provide
 interfaces TStream!(ubyte).InputStream and TStream!(ubyte).OutputStream.

 In this context, a Stream would implement TStream!(ubyte).InputStream and
 TStream!(ubyte).OutputStream; a Reader would implement
 TStream!(ubyte).InputStream and TStream!(dchar).OutputStream; a Writer 
 would
 implement TStream!(dchar).InputStream and TStream!(ubyte).OutputStream; 
 and a
 dchar-to-dchar filter would implement TStream!(dchar).InputStream and
 TStream!(dchar).OutputStream.

 Now THAT's how I'd LIKE to see it done.

It sounds pretty solid to me. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 25 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr958zkem5a2sq9 digitalmars.com>, Regan Heath says...
I dont think I understand what a Writer and a Reader are exactly..

AFAICS you have a Source i.e. a file and a Sink i.e. a socket. The source 
and sink are not streams themselves, but you could and probably would wrap 
a stream interface around them. In fact designing the stream as a template 
would allow you to wrap it around anything that provided the requisite 
methods i.e. read() write() etc.

Where does the Reader/Writer enter the picture? Are they simply filters 
(and thus streams) that convert from one data format to another? Or...

 dchar-sequence to dchar-sequence are another kettle of fish altogether.

By the same rationale as ubyte-sequence to ubyte-sequence aren't they a STREAM also?

Just to clarify, it is generally understood that in order to be considered a stream, the smallest unit you can read or write has to be ONE BYTE. A Reader is simply something which has a read() function, but for which the smallest unit it can read is ONE DCHAR. Similarly, a Writer is simply something which has a write() function, but for which the smallest unit it can write is one dchar. Readers/Writers are often not considered streams, for this reason.

There are a number of good reasons why this makes sense. If you write a dchar to a socket, for example, you would have to worry about the endianness of the thing. Should you send in machine byte order? Network byte order? And - just because you only write in four-byte chunks, that doesn't guarantee that what's at the other end of the socket won't read in one-byte chunks.

In actual fact, if you squirt dchars into a stream, four bytes at a time, then this is generally considered to be an encoding in its own right. The encoding is called UTF-32LE if the bytes are in little-endian order, or UTF-32BE if the bytes are in big-endian order. So, by calling it a stream, you've artificially added a new layer of encoding. In general, anything wider than a byte is not suitable for a stream (as such) because of byte-ordering issues.

A Reader therefore converts a stream OF BYTES into dchars for some internal use. That is, once you've got your dchars, they stay that way, and are dealt with as such by your application. Conversely with Writer.

We could call all of these things streams and leave it at that, of course, but that doesn't change the underlying problem, which is that a file, or socket (etc.) has no knowledge of character encoding standards, and therefore, conceptually, cannot store characters - only bytes. To interpret those bytes as characters, you need to know the encoding standard. (There are heuristic algorithms which can take a good guess, of course, but that's beside the point).

Does that help?

Arcane Jill
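
The byte-order point can be made concrete with a small sketch in present-day D. It assumes nothing from std.stream: decodeUtf32 and ByteOrder are hypothetical names invented for this illustration. The same bytes yield different dchars depending on which order the reader assumes, which is why the encoding has to be known up front.

```d
enum ByteOrder { little, big }

/// Decodes a UTF-32 byte sequence into dchars. The caller has to state the
/// byte order, because the bytes themselves do not carry it.
dchar[] decodeUtf32(const(ubyte)[] bytes, ByteOrder order)
{
    assert(bytes.length % 4 == 0, "UTF-32 input must be a multiple of four bytes");
    dchar[] result;
    for (size_t i = 0; i < bytes.length; i += 4)
    {
        const b = bytes[i .. i + 4];
        // Assemble one 32-bit code unit from four bytes in the stated order.
        immutable uint code = (order == ByteOrder.big)
            ? (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]
            : (b[3] << 24) | (b[2] << 16) | (b[1] << 8) | b[0];
        result ~= cast(dchar) code;
    }
    return result;
}

void main()
{
    // 'A' (U+0041) followed by the euro sign (U+20AC), four bytes each.
    ubyte[] be = [0x00, 0x00, 0x00, 0x41, 0x00, 0x00, 0x20, 0xAC];
    ubyte[] le = [0x41, 0x00, 0x00, 0x00, 0xAC, 0x20, 0x00, 0x00];
    assert(decodeUtf32(be, ByteOrder.big) == "A\u20AC"d);
    assert(decodeUtf32(le, ByteOrder.little) == "A\u20AC"d);
    // Same bytes, wrong assumed order: the result is no longer 'A'.
    assert(decodeUtf32(be, ByteOrder.little)[0] != 'A');
}
```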
Jun 26 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Sat, 26 Jun 2004 07:52:11 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <opr958zkem5a2sq9 digitalmars.com>, Regan Heath says...
 I dont think I understand what a Writer and a Reader are exactly..

 AFAICS you have a Source i.e. a file and a Sink i.e. a socket. The 
 source
 and sink are not streams themselves, but you could and probably would 
 wrap
 a stream interface around them. In fact designing the stream as a 
 template
 would allow you to wrap it around anything that provided the requisite
 methods i.e. read() write() etc.

 Where does the Reader/Writer enter the picture? Are they simply filters
 (and thus streams) that convert from one data format to another? Or...

 dchar-sequence to dchar-sequence are another kettle of fish altogether.

By the same rationale as ubyte-sequence to ubyte-sequence aren't they a STREAM also?

Just to clarify, it is generally understood that in order to be considered a stream, the smallest unit you can read or write has to be ONE BYTE. A Reader is simply something which has a read() function, but for which the smallest unit it can read is ONE DCHAR. Similarly, a Writer is simply something which has a write() function, but for which the smallest unit it can write is one dchar. Readers/Writers are often not considered streams, for this reason. There are a number of good reasons why this makes sense. If you write a dchar to a socket, for example, you would have to worry about the endianness of the thing. Should you send in machine byte order? Network byte order? And - just because you only write in four-byte chunks, that doesn't guarantee that what's at the other end of the socket won't read in one-byte chunks. In actual fact, if you squirt dchars into a stream, four bytes at a time, then this is generally considered to be an encoding in its own right. The encoding is called UTF-32LE if the bytes are in little-endian order, or UTF-32BE if the bytes are in big-endian order. So, by calling it a stream, you've artificially added a new layer of encoding. In general, anything wider than a byte is not suitable for a stream (as such) because of byte-ordering issues. A Reader therefore converts a stream OF BYTES into dchars for some internal use. That is, once you've got your dchars, they stay that way, and are dealt with as such by your application. Conversely with Writer. We could call all of these things streams and leave it at that, of course, but that doesn't change the underlying problem, which is that a file, or socket (etc.) has no knowledge of character encoding standards, and therefore, conceptually, cannot store characters - only bytes. To interpret those bytes as characters, you need to know the encoding standard. (There are heuristic algorithms which can take a good guess, of course, but that's beside the point). Does that help?

Yes indeed. I don't think you need something called a 'Reader' or something called a 'Writer'; IMO they are both filters. You said filters are streams. Well, these are filters but not streams, so I say most but not all filters are streams.

I think the simplest concepts which deal with this are the best to use, and IMO they are:

Stream - read() and write() ubytes.
Filter - convert from one thing to another.

So you have Streams, and then filters applied to them. When you want to read unicode text from a socket, you simply attach your unicode filter to the socket stream. As long as the other end used the same unicode filter to encode the data into the stream, yours will decode it correctly.

Regan.
Jun 26 2004
prev sibling parent reply Andy Friesen <andy ikagames.com> writes:
Arcane Jill wrote:
 In this context, a Stream would implement TStream!(ubyte).InputStream and
 TStream!(ubyte).OutputStream; a Reader would implement
 TStream!(ubyte).InputStream and TStream!(dchar).OutputStream; a Writer would
 implement TStream!(dchar).InputStream and TStream!(ubyte).OutputStream; and a
 dchar-to-dchar filter would implement TStream!(dchar).InputStream and
 TStream!(dchar).OutputStream.
 
 Now THAT's how I'd LIKE to see it done.

Why parameterize Stream? It seems much simpler to me if all Streams do is read and write arrays of bytes. Readers and Writers convert bytes to or from other formats (floats, strings, xml, whatever). Filters could merely be decorator objects which wrap a Stream, Reader, or Writer and do some useful permutation. As a convenience, Readers and Writers could have constructors which accept filenames, saving the programmer the trouble of creating the stream separately. -- andy
Jun 25 2004
parent Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbiarm$hhj$1 digitaldaemon.com>, Andy Friesen says...

Why parameterize Stream?  It seems much simpler to me if all Streams do 
is read and write arrays of bytes.  Readers and Writers convert bytes to 
or from other formats. (floats, strings, xml, whatever)  Filters could 
merely be decorator objects which wrap a Stream, Reader, or Writer and 
do some useful permutation.

Fair enough. That's another way of doing it. The only difference really is that a dchar-based stream MAY NOT read/write in anything other than multiples of four bytes. ALL reads/writes, for ALL functions, would have to guarantee this. Functions like read(byte), read(char), read(short), etc., would all have to either throw an exception, or (less safe) call read(dchar) and truncate the result. But yeah - it could be done that way too, just as easily. Maybe more easily. Jill
Jun 25 2004
prev sibling parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbfdpk$2aku$1 digitaldaemon.com>, Sean Kelly says...

5) In fact, all possible combinations of file opening supported by fopen()
should be supported by File. It should be possible to assert that the file does
or does not exist before opening it (atomically), to truncate or not truncate,
to position the file pointer at the start or end of the file, to allow
append-only access, etc.

Right now the file stuff uses CreateFile in Windows. Would it be better to use fopen and the other ANSI calls instead?

Probably not, but I was talking about capabilities, not implementation.

On the Windows platform, CreateFile() is presumably better because (a) fopen() is written in terms of CreateFile() anyway, and (b) CreateFile() lets you open UNC pathnames, device-drivers (with paths starting with "//?/"), and so on. Also, I believe that CreateFile() can cope with NTFS "streams" (which is one of Microsoft's dumber ideas, but it's there) whereas fopen() can't. So, on the whole, I think you made the right choice in choosing the method with the most capabilities.

The gripe, however, is that the most basic of those capabilities are not passed on to the Phobos user. You can't do, for example, the equivalent of fopen(filename, "a+").

From a user's point of view, fopen() is easy to use, and CreateFile() is hard to use. fopen() can be used blindfold, asleep, and/or half drunk, but CreateFile() requires a trip to the manual every time, and about ten lines of code. So ideally, you'd want the SIMPLICITY of fopen(), but the POWER of CreateFile(). Maybe that's too much to ask.

In any case, if std.stream.Streams are less powerful than fopen(), then in many cases, we'll be forced to go back to using fopen(), fgets(), fread(), etc., simply because std.streams don't cut the mustard. fopen() functionality has to be MINIMAL functionality for std.streams.

Oh - one other thing I forgot. I think we need functions like basename(), dirname(), pathinfo(), realpath() and so on, (stolen from PHP), and some function to append a pathname-component to a pathname. Of course, these things are dead easy to do with ordinary string manipulation ... IF you assume that the file separator is "/". But that won't work on a Mac. Such functions would let us manipulate pathnames in a platform independent way. (These should go in std.file, not std.stream, obviously).

Arcane Jill
Jun 24 2004
next sibling parent reply Sam McCall <tunah.d tunah.net> writes:
Arcane Jill wrote:
 Oh - one other thing I forgot. I think we need functions like basename(),
 dirname(), pathinfo(), realpath() and so on, (stolen from PHP), and some
 function to append a pathname-component to a pathname. Of course, these things
 are dead easy to do with ordinary string manipulation ... IF you assume that
the
 file separator is "/". But that won't work on a Mac.

Windows... unless you mean classic Mac OS? I don't think there are any plans to port a D compiler to that. Yes, these functions would be useful! Sam
Jun 25 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 21:43:46 +1200, Sam McCall <tunah.d tunah.net> wrote:

 Arcane Jill wrote:
 Oh - one other thing I forgot. I think we need functions like 
 basename(),
 dirname(), pathinfo(), realpath() and so on, (stolen from PHP), and some
 function to append a pathname-component to a pathname. Of course, these 
 things
 are dead easy to do with ordinary string manipulation ... IF you assume 
 that the
 file separator is "/". But that won't work on a Mac.

Windows... unless you mean classic Mac OS? I don't think there are any plans to port a D compiler to that. Yes, these functions would be useful! Sam

C's stdlib.h on Windows has:

_fullpath - figures out the absolute path of a given relative path
_makepath - creates a path name from components
_splitpath - breaks a path name into components

I don't believe they are ANSI (as indicated in the docs and by the _ on the front of the function name).

I have written all the functions above before in C, typically using a #define SEP '/' or #define SEP '\\', plus functions to un-mix separators in any given path (you can't trust users to type anything right!)

I reckon they'd be dead simple in D. I might give them a go when I have some spare time.

Regan.
Jun 25 2004
prev sibling parent "Carlos Santander B." <carlos8294 msn.com> writes:
"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cbgi0c$sp7$1 digitaldaemon.com
|
| ...
|
| Oh - one other thing I forgot. I think we need functions like basename(),
| dirname(), pathinfo(), realpath() and so on, (stolen from PHP), and some
| function to append a pathname-component to a pathname. Of course, these
things
| are dead easy to do with ordinary string manipulation ... IF you assume
that the
| file separator is "/". But that won't work on a Mac. Such functions would
let us
| manipulate pathnames in a platform independent way. (These should go in
| std.file, not std.stream, obviously).
|
| Arcane Jill

std.path
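
A quick sketch of what that looks like in practice. The function names below (baseName, dirName, buildPath, extension, dirSeparator) are the ones in current Phobos; the module as it stood when this thread was written may have spelled some of them differently, so treat the exact names as an assumption for older compilers.

```d
import std.path : baseName, buildPath, dirName, dirSeparator, extension;

void main()
{
    // buildPath joins components with the platform's own separator,
    // so the code never hard-codes "/" or "\".
    string path = buildPath("data", "logs", "app.txt");
    assert(path == "data" ~ dirSeparator ~ "logs" ~ dirSeparator ~ "app.txt");

    // Splitting works the same way on every platform.
    assert(baseName(path) == "app.txt");
    assert(dirName(path) == "data" ~ dirSeparator ~ "logs");
    assert(extension(path) == ".txt");
}
```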

-----------------------
Carlos Santander Bernal
Jun 25 2004
prev sibling next sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Thu, 24 Jun 2004 19:04:16 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <cbf0lp$1lvi$1 digitaldaemon.com>, Ben Hinkle says...

 I'd first like to see what Sean does with std.stream plus I tend to 
 agree
 with Matthew that more discussion is needed before jumping too soon in 
 any
 one direction. There's a lot of cool stuff in mango that I'd love to see
 somehow merged with std.stream if possible. Maybe I'll give a poke at
 porting Kris's tokenizers and endian stuff to std.stream just to see 
 what
 it looks like.

I'd be quite happy if std.stream were to be improved. Here are some suggestions. You'll probably think that many of them are trivial, but each, in their own way, contributes a small amount of annoyance, and I'm sure these things could be easily got rid of. 1) Since it is more normal to want buffered file access than non-buffered file access (in C, fopen() is called more often than open()), it makes sense that File should be buffered by default, and there should be a separate class, maybe called RawFileStream or something, for the unbuffered case.

I'd call it RawFile, and it should mirror open/close etc.
 2) File should in any case be renamed FileStream

I think it should stay File, and should mirror fopen etc. If we want to stream it, we pass it into the constructor of a Stream or BufferedStream (File being buffered would not generally get passed to this as it would be a waste of time) etc.
 3) FileMode.In and FileMode.Out should be renamed FileMode.IN and 
 FileMode.OUT
 respectively.

I agree.
 4) It should be possible to construct a File object in create mode, in 
 one step.
 As in File f = new File(filename, FileMode.CREATE);

We need flags to allow us to mirror these fopen modes:

"r" Opens for reading. If the file does not exist or cannot be found, the fopen call fails.
"w" Opens an empty file for writing. If the given file exists, its contents are destroyed.
"a" Opens for writing at the end of the file (appending) without removing the EOF marker before writing new data to the file; creates the file first if it doesn’t exist.
"r+" Opens for both reading and writing. (The file must exist.)
"w+" Opens an empty file for both reading and writing. If the given file exists, its contents are destroyed.
"a+" Opens for reading and appending; the appending operation includes the removal of the EOF marker before new data is written to the file and the EOF marker is restored after writing is complete; creates the file first if it doesn’t exist.

So it seems we need:

"r" - READ - read, fails if file does not exist.
"w" - CREATE - write, overwrite existing.
"a" - APPEND - write, create if not exist.
"r+" - READ|WRITE - read, write, fails if file does not exist.
"w+" - READ|CREATE - read, write, overwrite existing.
"a+" - READ|APPEND - read, append, create if not exist.
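
As a rough sketch of how that mapping could look in present-day D: the OpenMode enum and toFopenMode function are hypothetical, made up here to mirror the flag names listed above, and are not an existing Phobos API. Combinations that are not in the list simply map to null.

```d
/// Hypothetical flags taken from the list above; not an existing Phobos enum.
enum OpenMode : uint
{
    READ   = 1,
    WRITE  = 2,
    CREATE = 4,  // write, truncating any existing file
    APPEND = 8,  // write at the end, creating the file if needed
}

/// Returns the fopen() mode string for a flag combination, or null when the
/// combination has no entry in the list.
string toFopenMode(uint flags)
{
    switch (flags)
    {
        case OpenMode.READ:                   return "r";
        case OpenMode.CREATE:                 return "w";
        case OpenMode.APPEND:                 return "a";
        case OpenMode.READ | OpenMode.WRITE:  return "r+";
        case OpenMode.READ | OpenMode.CREATE: return "w+";
        case OpenMode.READ | OpenMode.APPEND: return "a+";
        default:                              return null;
    }
}

void main()
{
    assert(toFopenMode(OpenMode.READ) == "r");
    assert(toFopenMode(OpenMode.READ | OpenMode.APPEND) == "a+");
    // WRITE on its own is not in the list, so there is nothing to map it to.
    assert(toFopenMode(OpenMode.WRITE) is null);
}
```

Keeping the table in one function makes it easy to see which flag combinations are still unsupported.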
 5) In fact, all possible combinations of file opening supported by 
 fopen()
 should be supported by File. It should be possible to assert that the 
 file does
 or does not exist before opening it (atomically), to truncate or not 
 truncate,
 to position the file pointer at the start or end of the file, to allow
 append-only access, etc.

Agreed. See above. We also might want WRITE|NEW - write, fail if the file exists.
 6) The destructor should always close the file

As long as there is a way to dup the underlying handle so we can store and re-use it if desired then I agree.
 7) EITHER Stream classes should be auto (likely to be an unpopular 
 suggestion, I
 know), OR there should be an auto wrapper class that you can construct 
 from a
 Stream, in order to guarantee that the file will be closed in the event 
 of an
 exception (which could of course be thrown by ANY piece of code). 
 Currently we
 have to either roll our own auto wrapper, or use a try/catch block.

Shouldn't file be auto? And closed in the destructor.
 8) Documentation should be complete and accurate.

But of course. In an ideal world.. give it a little time. Perhaps the docs are not there because the author of std.stream is not yet happy with it?
 9) There should be a FilterStream class, from which BufferedStream 
 inherits, so
 that we can write our own stream filters. (Java does this. It's neat).

Definitely.
 10) Streams don't necessarily have to do transcoding (see - I learnt a 
 new
 word), but nonetheless it should be POSSIBLE to construct them from a
 Reader/Writer in order to make such extensions possible in the future.

Why not simply use a filter as mentioned above to transcode?
 11) I want the function available(), as Java has. A buffered stream 
 always knows
 how much it's got left in its buffer, and I have no problem with an 
 unbuffered
 stream returning zero.

Isn't this true for a normal unbuffered file as well? At the point of opening you know how big it is. It could grow, but until you reach that initial size you know whether there is more or not.
 12) stdin, stdout and stderr should be globally available D streams. 
 (Maybe they
 are already, but point (8) means there's a lot I don't know about 
 existing
 capabilities)

They are.
 13) Streams should overload the << and >> operators. (Someone suggested 
 using ~.
 That would be fine too).

I have never liked << and >>. I don't think ~ is quite the right thing. It mostly makes sense for writing to a stream, but not reading. Whatever we use, it should be efficient, i.e. this statement: Stream s; s ~= "regan" ~ "was" ~ "here" should not append "here" to "was", then that to "regan", then send it to the stream, as this is inefficient, especially in comparison to a buffered stream.
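
To illustrate the efficiency concern, here is a sketch in present-day D with a hypothetical Sink class (not std.stream) that counts how many writes reach it. With a binary ~ that returns the sink, each piece is written directly; with ~=, the right-hand side is concatenated into a temporary first.

```d
/// Hypothetical sink that records what was written and in how many calls.
class Sink
{
    char[] written;
    size_t writeCalls;

    void write(const(char)[] text)
    {
        written ~= text;
        ++writeCalls;
    }

    /// sink ~ a ~ b parses as (sink ~ a) ~ b, so each piece is written
    /// directly and no temporary concatenation is built.
    Sink opBinary(string op : "~")(const(char)[] text)
    {
        write(text);
        return this;
    }

    /// sink ~= expr evaluates expr first, so a ~ b ~ c is joined beforehand.
    Sink opOpAssign(string op : "~")(const(char)[] text)
    {
        write(text);
        return this;
    }
}

void main()
{
    string a = "regan", b = "was", c = "here";

    auto chained = new Sink;
    auto same = chained ~ a ~ b ~ c;   // three direct writes
    assert(same is chained);
    assert(chained.writeCalls == 3 && chained.written == "reganwashere");

    auto joined = new Sink;
    joined ~= a ~ b ~ c;               // one write of a pre-built temporary
    assert(joined.writeCalls == 1 && joined.written == "reganwashere");
}
```

Both forms produce the same output; the difference is only in where the concatenation work happens.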
 None of these is particularly difficult in and of itself, but together 
 they add
 up to a frustrating gripe list. But I'm fairly confident that if these 
 flaws are
 fixed (along with any other gripes which others may mention in the 
 course of
 this thread) then I imagine that most people will be pretty happy with 
 new
 improved std.stream.

Definitely.
 This will probably open up rat-holes, but two quick examples of things 
 to
 discuss:

 1) in mango it looks like to open a file and read it you need to create 
 a
 FileConduit and pass that to a Reader constructor. So you have to grok 
 the
 difference between Conduits and Readers/Writers (and maybe Buffers? I
 notice IConduit has a createBuffer method so is it not buffered by 
 default?
 I'm not sure). In std.stream you make one object and there is less to 
 grok.
 The flexibility of mango is probably nice but it adds complexity. Each
 person has a different notion of where to draw the boundaries.

But there is logic behind it. Currently, D does no transcoding - that is, writeLine() will spit out raw UTF-8. Now that's fine if your output is going to a text file, but if it's going to a console, you're screwed. Now you COULD simplify this a bit by "automatically" encoding the output in the operating system default encoding - but that would just reverse the problem. Now, output to the console would be fine, but output destined to leave your machine and end up on someone else's machine (e.g. text file, socket, etc.) would also be similarly munged. UTF-8 is pretty much the best portable format, so ideally you only want to encode at the last minute, just before the stream hits the user.

I think a stream should write raw what you give it, meaning you have to use a filter to transcode/convert etc. your data to the correct format for the destination. We can design standard filters for common things like the console etc., and these filters can be automatically plugged into stdin/stdout etc.
 2) in mango to use object serialization/deserialization you register an
 instance of a class so that means at startup you basically have to
 instantiate one of every class that might want to be deserialized. Seems
 wasteful and it could affect class design to avoid having classes that 
 have
 interdependencies.

I'm not convinced that serialization necessarily has anything to do with streams. You could serialize to a string, or an in-memory buffer. I guess that would be faster for small objects but disadvantageous for very large ones. In any case, you don't need to decide on a firm serialization policy in order to make streams feel nice. That can come later, once we're happy with the basics.

I think serialization could be done with filters, just like transcoding can. You just plug all the filters together in between your source and destination streams. Simple as that. Regan
Jun 24 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr94eenw35a2sq9 digitalmars.com>, Regan Heath says...

If we want to stream it [File], we pass it into the constructor of a Stream or 
BufferedStream

A File /IS/ a stream. How could it not be? Sorry, I just didn't understand you here.
 11) I want the function available(), as Java has. A buffered stream 
 always knows
 how much it's got left in its buffer, and I have no problem with an 
 unbuffered
 stream returning zero.

Isn't this true for a normal unbuffered file as well? At the point of opening you know how big it is. It could grow, but until you reach that initial size you know whether there is more or not.

Ah - now it's I who was misunderstood. Allow me to clarify. available() must return a number which is less than or equal to the number of bytes which may be read from a stream ... and this is the important part ... WITHOUT BLOCKING. available() MUST return immediately, without causing a thread-switch. It must *NOT* return the number of bytes left in a file - unless all of them are already buffered. This is SO important in bits of code which MUST NOT WAIT. Arcane Jill
Jun 25 2004
next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Arcane Jill"  wrote ...
 Ah - now it's I who was misunderstood. Allow me to clarify. available() must return
 a number which is less than or equal to the number of bytes which may be read
 from a stream ... and this is the important part ... WITHOUT BLOCKING.
 available() MUST return immediately, without causing a thread-switch. It must
 *NOT* return the number of bytes left in a file - unless all of them are already
 buffered.

 This is SO important in bits of code which MUST NOT WAIT.

 Arcane Jill

May I enquire, Jill, as to why you need such functionality? I'm thinking at the 50,000' level rather than the intimate details of some IO implementation. It's always useful to understand the application. Secondly, if the IO were always buffered, and you had access to the content thereof (plus the number of readable bytes), would that satisfy the requirement? - Kris
Jun 25 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbgjnv$105r$1 digitaldaemon.com>, Kris says...

May I enquire, Jill, as to why you need such functionality? I'm thinking at
the 50,000' level rather than the intimate details of some IO
implementation. It's always useful to understand the application.

For example, consider a cryptographically secure random number stream. You'd want the ultra-secure version which always blocks until sufficient entropy is available - no problem there - but some folk would also want a non-blocking (less secure) version (like the difference between Unix's /dev/random and /dev/urandom). The non-blocking version would call available() on the entropy stream before trying to collect the entropy, in order to provide a guarantee of non-blocking. If bytes were available, it could read them, and be as secure as possible. If bytes were not available it could re-stir the existing entropy pool, and still return immediately. This sort of thing is absolutely crucial in crypto.
Secondly, if the IO were always buffered, and you had access to the content
thereof (plus the number of readable bytes), would that satisfy the
requirement?

Not all streams which are able to deliver bytes on demand without waiting necessarily have a buffer. I have a proof-of-concept stream in my in-progress crypto random stuff which simply delivers bytes by calling rand(). Such a stream will never block, and its available() function could simply always return 2, or 128, or any other arbitrary number. It does not, however, have a buffer to return.

Access to the contents of an internal buffer implies a certain implementation. This assumption may not always be correct, or relevant. available(), by itself, would be enough. Thereafter you could get the "buffer contents" with a straightforward read().

Arcane Jill
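
The non-blocking pattern described here can be sketched in present-day D. Everything below is hypothetical and only illustrates the contract: InputStream, QueuedEntropy and Pool are invented names, and the "stirring" arithmetic is a placeholder with no cryptographic value.

```d
import std.algorithm : min;

/// Hypothetical interface: read() may block in general, available() must not.
interface InputStream
{
    size_t read(ubyte[] buf);
    /// A lower bound on the bytes read() can deliver without blocking.
    size_t available();
}

/// Stand-in for an entropy source: only bytes already queued are readable.
class QueuedEntropy : InputStream
{
    private ubyte[] queue;

    void feed(const(ubyte)[] bytes) { queue ~= bytes; }

    size_t available() { return queue.length; }

    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, queue.length);
        buf[0 .. n] = queue[0 .. n];
        queue = queue[n .. $];
        return n;
    }
}

/// Non-blocking consumer: mixes in fresh entropy only when it is already
/// there, otherwise re-stirs the existing pool and returns at once.
struct Pool
{
    ubyte[4] state;

    /// Returns true if fresh entropy was mixed in.
    bool refresh(InputStream source)
    {
        if (source.available() >= state.length)
        {
            ubyte[4] fresh;
            source.read(fresh[]);
            foreach (i, ref b; state)
                b ^= fresh[i];
            return true;
        }
        foreach (ref b; state)
            b = cast(ubyte)(b * 31 + 7);   // placeholder "stir", not real crypto
        return false;
    }
}

void main()
{
    auto source = new QueuedEntropy;
    Pool pool;
    assert(!pool.refresh(source));   // nothing queued: re-stir, no waiting
    source.feed([1, 2, 3, 4]);
    assert(pool.refresh(source));    // four bytes ready: consume them
    assert(source.available() == 0);
}
```

Note that Pool only needs available() and read(); it never asks for an internal buffer, which matches the point that available() by itself is enough.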
Jun 25 2004
next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Arcane Jill"  wrote
 Not all streams which are able to deliver bytes on demand without waiting
 necessarily have a buffer. I have a proof-of-concept stream in my in-progress
 crypto random stuff which simply delivers bytes by calling rand(). Such a stream
 will never block, and its available() function could simply always return 2, or
 128, or any other arbitrary number. It does not, however, have a buffer to
 return.

Right; poor phrasing on my part. In terms of D exposure, would something like an IAvailable interface suffice? If so, what about the equivalent for writing? Is there a similar need to never perform a thread-switch? - Kris
Jun 25 2004
parent Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbgm5b$13rn$1 digitaldaemon.com>, Kris says...
"Arcane Jill"  wrote
 Not all streams which are able to deliver bytes on demand without waiting
 necessarily have a buffer. I have a proof-of-concept stream in my in-progress
 crypto random stuff which simply delivers bytes by calling rand(). Such a stream
 will never block, and its available() function could simply always return 2, or
 128, or any other arbitrary number. It does not, however, have a buffer to
 return.

Right; poor phrasing on my part. In terms of D exposure, would something like an IAvailable interface suffice?

I don't know, because I don't know what IAvailable does. I suspect not, however. Essentially, I need to write a non-blocking stream filter, which transforms one (underlying) stream into another. Thus, you'd construct a Jill's Stream from a std.stream - which guarantees that the underlying stream has read(), write(), etc., (and available(), if such a function is added to stream). Now, if, instead, Jill's Stream were to be constructed from an IAvailable, then you'd guarantee that underlying - thing - had an available() function, but, unless I've misunderstood, it would NOT have read() and write(), which kinda makes it useless. But I don't see the problem with simply adding available() to std.stream. It's a VERY easy function to add - in most cases it could be implemented as { return 0; }
If so, what about the equivalent for
writing? Is there a similar need to never perform a thread-switch?

- Kris

I can't think of one. I think that the fact that Java has available() for reading, but not for writing, is a clue that this is kind of the desirable thing to do. (Not that Java always gets things right, of course, but in this case it would be spot on for my needs). Arcane Jill
Jun 25 2004
prev sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 07:51:12 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <cbgjnv$105r$1 digitaldaemon.com>, Kris says...

 May I enquire, Jill, as to why you need such functionality? I'm 
 thinking at
 the 50,000' level rather than the intimate details of some IO
 implementation. It's always useful to understand the application.

For example, consider a cryptographically secure random number stream. You'd want the ultra-secure version which always blocks until sufficient entropy is available - no problem there - but some folk would also want a non-blocking (less secure) version (like the difference between Unix's /dev/random and /dev/urandom). The non-blocking version would call available() on the entropy stream before trying to collect the entropy, in order to provide a guarantee of non-blocking. If bytes were available, it could read them, and be as secure as possible. If bytes were not available it could re-stir the existing entropy pool, and still return immediately. This sort of thing is absolutely crucial in crypto.
 Secondly, if the IO were always buffered, and you had access to the 
 content
 thereof (plus the number of readable bytes), would that satisfy the
 requirement?

Not all streams which are able to deliver bytes on demand without waiting necessarily have a buffer. I have a proof-of-concept stream in my in-progress crypto random stuff which simply delivers bytes by calling rand(). Such a stream will never block, and its available() function could simply always return 2, or 128, or any other arbitrary number. It does not, however, have a buffer to return. Access to the contents of an internal buffer implies a certain implementation. This assumption may not always be correct, or relevant. available(), by itself, would be enough. Thereafter you could get the "buffer contents" with a straightforward read().

Let's assume for the sake of it that we're talking about a FileStream. You open it, then you call available(), and it returns 0 as there is nothing in the buffer yet. You return immediately and do something else. At which point does the stream actually read something into its buffer?

To me it seems something has to tell it to read stuff into its buffer. Either:
   - it does this on open
   - you have another thread constantly polling it telling it to read
   - something else?

Regan.
Jun 25 2004
next sibling parent Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr95c8xsi5a2sq9 digitalmars.com>, Regan Heath says...

Let's assume for the sake of it that we're talking about a FileStream.. 
so.. you open it, then you call available() it returns 0 as there is 
nothing in the buffer.. yet.. you (return immediately) and do something 
else.. at which point does the stream actually read something into its 
buffer?

To me it seems something has to tell it to read stuff into its buffer. 
Either:
   - it does this on open
   - you have another thread constantly polling it telling it to read
   - something else?

Correct. Think of an entropy stream like a pipe, or a socket. One process pulls stuff out. Another process pushes stuff in. You could run an entropy-gathering daemon in the background, if you chose. Alternatively, a blocking entropy generator could, on blocking, instruct the user to "wiggle that mouse" to feed the entropy pool until there's enough there. Strong crypto is hard to get right. But I intend to try, and do better than most. Jill
Jun 25 2004
prev sibling parent reply Sam McCall <tunah.d tunah.net> writes:
Regan Heath wrote:
 Let's assume for the sake of it that we're talking about a FileStream.. 
 so.. you open it, then you call available() it returns 0 as there is 
 nothing in the buffer.. yet.. you (return immediately) and do something 
 else.. at which point does the stream actually read something into its 
 buffer?

The point is that available() is simply a request for information about the stream, _without_ telling it to do any work. Sam
Jun 25 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Sat, 26 Jun 2004 02:45:45 +1200, Sam McCall <tunah.d tunah.net> wrote:

 Regan Heath wrote:
 Let's assume for the sake of it that we're talking about a FileStream.. 
 so.. you open it, then you call available() it returns 0 as there is 
 nothing in the buffer.. yet.. you (return immediately) and do something 
 else.. at which point does the stream actually read something into its 
 buffer?

The point is that available() is simply a request for information about the stream, _without_ telling it to do any work. Sam

And my point was, if you only call available, you never tell it to do any work, so there will never *be* anything 'available' :) Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 25 2004
prev sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 07:08:05 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:
 In article <opr94eenw35a2sq9 digitalmars.com>, Regan Heath says...

 If we want to stream it [File], we pass it into the constructor of a 
 Stream or
 BufferedStream

A File /IS/ a stream. How could it not be? Sorry, I just didn't understand you here.

That's my point. I don't think it should be a stream. fopen etc. is not a stream. We want something as a drop-in replacement for that; then we write a stream class, one that will take any class that supports read(), write() etc. Perhaps my idea of streams is different to the norm?
 11) I want the function available(), as Java has. A buffered stream
 always knows
 how much it's got left in its buffer, and I have no problem with an
 unbuffered
 stream returning zero.

Isn't this true for a normal unbuffered file as well? At the point of opening you know how big it is. It could grow, but until you reach that initial size you know whether there is more or not.

Ah - now it's I who was misunderstood. Allow me to clarify. available() must return a number which is less than or equal to the number of bytes which may be read from a stream ... and this is the important part ... WITHOUT BLOCKING. available() MUST return immediately, without causing a thread-switch. It must *NOT* return the number of bytes left in a file - unless all of them are already buffered.

ahh.. ok this would be part of the BufferedStream class. And it would simply return the # in the buffer. Simple, easy, fast, efficient. :)
 This is SO important in bits of code which MUST NOT WAIT.

 Arcane Jill

Jun 25 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr95cp4td5a2sq9 digitalmars.com>, Regan Heath says...
 A File /IS/ a stream. How could it not be? Sorry, I just didn't 
 understand you
 here.

That's my point. I don't think it should be a stream. fopen etc. is not a stream. We want something as a drop-in replacement for that; then we write a stream class, one that will take any class that supports read(), write() etc.

That's what most of us mean by "stream". A stream is simply something that does read() and write().
Perhaps my idea of streams is different to the norm?

Could be.
ahh.. ok this would be part of the BufferedStream class. And it would 
simply return the # in the buffer. Simple, easy, fast, efficient. :)

Actually, it should return the number in the buffer PLUS the value returned by the underlying stream's available() function - because the buffered stream can get at least that many bytes from the underlying stream without blocking, and use those bytes to refill its own buffer. (That's assuming that the buffered stream is happy to get less than a bufferful at a time from the underlying stream). And, the function should be present in unbuffered streams too, and in these cases it should return zero. But yes - simple, easy, fast and efficient. And useful. Jill
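The rule above (bytes already buffered plus whatever the underlying stream reports) might look something like this. It is a minimal sketch for a recent D compiler; InputStream, RawSource, ReadySource and BufferedInput are hypothetical names rather than std.stream classes, and this read() refills only with what is available, where a real blocking read() would wait instead.

```d
import std.algorithm : min;
import std.stdio : writeln;

// Hypothetical minimal stream interface for this sketch.
interface InputStream
{
    size_t available();          // bytes readable without blocking
    size_t read(ubyte[] buf);    // returns the number of bytes read
}

// Unbuffered stream that can promise nothing: available() is simply 0.
class RawSource : InputStream
{
    size_t available() { return 0; }
    size_t read(ubyte[] buf) { return 0; }
}

// Stream with `ready` bytes pending, e.g. a pipe that already holds data.
class ReadySource : InputStream
{
    private size_t ready;
    this(size_t ready) { this.ready = ready; }

    size_t available() { return ready; }

    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, ready);
        buf[0 .. n] = 0xAB;      // dummy payload
        ready -= n;
        return n;
    }
}

class BufferedInput : InputStream
{
    private InputStream inner;
    private ubyte[] buffer;      // backing storage
    private size_t pos, len;     // unread bytes are buffer[pos .. len]

    this(InputStream inner, size_t capacity = 8)
    {
        this.inner = inner;
        buffer = new ubyte[capacity];
    }

    // Buffered bytes PLUS what the underlying stream can hand over at once.
    size_t available()
    {
        return (len - pos) + inner.available();
    }

    size_t read(ubyte[] buf)
    {
        if (pos == len)
        {
            // Refill with at most what is available, so this never waits.
            immutable want = min(inner.available(), buffer.length);
            pos = 0;
            len = inner.read(buffer[0 .. want]);
        }
        immutable n = min(buf.length, len - pos);
        buf[0 .. n] = buffer[pos .. pos + n];
        pos += n;
        return n;
    }
}

void main()
{
    auto buffered = new BufferedInput(new ReadySource(5), 4);
    writeln(buffered.available());   // nothing buffered yet, 5 underneath
    ubyte[1] one;
    buffered.read(one[]);
    writeln(buffered.available());   // 3 left in the buffer, 1 underneath
}
```

Wrapping RawSource instead gives an available() of zero, which is the unbuffered case described above.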
Jun 25 2004
next sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 11:31:12 +0000 (UTC), Arcane Jill 
<Arcane_member pathlink.com> wrote:

 In article <opr95cp4td5a2sq9 digitalmars.com>, Regan Heath says...
 A File /IS/ a stream. How could it not be? Sorry, I just didn't
 understand you
 here.

That's my point. I don't think it should be a stream. fopen etc. is not a stream. We want something as a drop-in replacement for that; then we write a stream class, one that will take any class that supports read(), write() etc.

That's what most of us mean by "stream". A stream is simply something that does read() and write().

Ahh.. ok. If I revise my internal definition I think I'll get what you all mean.
 Perhaps my idea of streams is different to the norm?

Could be.

Seems likely.
 ahh.. ok this would be part of the BufferedStream class. And it would
 simply return the # in the buffer. Simple, easy, fast, efficient. :)

Actually, it should return the number in the buffer PLUS the value returned by the underlying stream's available() function - because the buffered stream can get at least that many bytes from the underlying stream without blocking, and use those bytes to refill its own buffer. (That's assuming that the buffered stream is happy to get less than a bufferful at a time from the underlying stream).

Unless the underlying stream (new word/concept) i.e. the file cannot give it the data without possibly blocking i.e. disk IO right?
 And, the function should be present in unbuffered streams too, and in 
 these
 cases it should return zero.

Which is sensible as there is 0 in its buffer, because there is no buffer.
 But yes - simple, easy, fast and efficient. And useful.

I think things like File, Socket etc. should be unbuffered. Then we write a BufferedStream template/class which takes any Stream, i.e. File or Socket, and buffers for them. After all, buffering is the same for all, so why write it 10 times when once will do? If you like, you write RawFile and BufferedStream, then:

     typedef BufferedStream!(RawFile) File;

Yes?

Regan
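A rough sketch of that "write the buffering once" idea, for a recent D compiler. The RawFile here is a hypothetical in-memory stand-in (so the example runs anywhere), BufferedStream is not the std.stream class of the same name, and alias is used where the post says typedef because current D compilers no longer accept typedef.

```d
import std.stdio : writeln;

// Hypothetical unbuffered file: each read() would normally hit the OS.
// It serves bytes from memory here and counts how often it is called.
class RawFile
{
    private ubyte[] data;
    uint calls;

    this(ubyte[] data) { this.data = data; }

    size_t read(ubyte[] buf)
    {
        ++calls;
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Buffering written once, for any T that offers read(ubyte[]).
class BufferedStream(T)
{
    private T inner;
    private ubyte[4] buffer;     // deliberately tiny so refills are visible
    private size_t pos, len;

    this(T inner) { this.inner = inner; }

    size_t read(ubyte[] buf)
    {
        size_t done;
        while (done < buf.length)
        {
            if (pos == len)
            {
                pos = 0;
                len = inner.read(buffer[]);
                if (len == 0)
                    break;       // underlying stream is exhausted
            }
            buf[done++] = buffer[pos++];
        }
        return done;
    }
}

// The common case gets the short name; RawFile stays available.
alias File = BufferedStream!RawFile;

void main()
{
    ubyte[] bytes = [1, 2, 3, 4, 5, 6];
    auto raw = new RawFile(bytes);
    auto f = new File(raw);
    ubyte[1] b;
    while (f.read(b[]) == 1) {}  // read one byte at a time until empty
    writeln("underlying reads: ", raw.calls);
}
```

Any other class with a matching read() (a socket wrapper, say) could be passed as T without touching the buffering code.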
Jun 25 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <opr959bek45a2sq9 digitalmars.com>, Regan Heath says...

 ahh.. ok this would be part of the BufferedStream class. And it would
 simply return the # in the buffer. Simple, easy, fast, efficient. :)

Actually, it should return the number in the buffer PLUS the value returned by the underlying stream's available() function - because the buffered stream can get at least that many bytes from the underlying stream without blocking, and use those bytes to refill its own buffer. (That's assuming that the buffered stream is happy to get less than a bufferful at a time from the underlying stream).

Unless the underlying stream (new word/concept) i.e. the file cannot give it the data without possibly blocking i.e. disk IO right?

Still works even then, because if the underlying stream cannot give it the data without possibly blocking then its available() function will return zero. In this case, adding zero to the return value won't exactly hurt.
If you like you write RawFile and BufferedStream then typedef 
BufferedStream!(RawFile) File

Yes?

Fair enough, but the style guide says "meaningless type aliases should be avoided". Anyway, I /did/ say that many of my file/stream gripes were individually trivial. Jill
Jun 26 2004
parent reply "Bent Rasmussen" <exo bent-rasmussen.info> writes:
 Fair enough, but the style guide says "meaningless type aliases should be
 avoided".

E.g.

     alias int INT;

I should think

     alias BufferedStream!(File) BufferedFile;

is less meaningless. But is there any substantial reason for File not to be unbuffered, except having to type BufferedFile instead of File? It seems awkward to screw with the names in this way to save a couple of keystrokes, although I like short names.
Jun 26 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Sat, 26 Jun 2004 18:03:46 +0200, Bent Rasmussen 
<exo bent-rasmussen.info> wrote:

 Fair enough, but the style guide says "meaningless type aliases should 
 be
 avoided".

E.g. alias int INT; I should think alias BufferedStream!(File) BufferedFile; is less meaningless.

I agree, but why not:
     alias BufferedStream!(RawFile) File;

as in most cases people want a buffered file. But you may want an unbuffered one.
 But is there any substantial reason for File not to be unbuffered, 
 except
 having to type BufferedFile instead of File? It seems awkward to screw 
 with
 the names in this way to save a couple of keystrokes, although I like 
 short
 names.

"File" and "RawFile" short and to the point. :0) Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 26 2004
parent reply "Bent Rasmussen" <exo bent-rasmussen.info> writes:
 I agree. but why not.

     alias BufferedStream!(RawFile) File;

as in most cases people want a buffer file. But.. you may want an unbuffered one.

I follow the principle that if a' is an extension of a, then the name of a' signals that. I don't, in general, choose a name for a which is more complex than the name of a'. Of course it is nice to have the common case have a short name, but it's hardly difficult to remember it once you've learned it, and a shorter name can perhaps be found. A prominent exception is uint vs int, where the simple type has a name that expresses an exception to a less simple type, but either I'm used to it or it's just so discreet that it doesn't bother me. And if both types have non-composite names, then the whole "issue" disappears (e.g. natural, integer, real).
 "File" and "RawFile" short and to the point. :0)

It is. Don't mind me. :)
 Regan

Jun 27 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Sun, 27 Jun 2004 13:19:05 +0200, Bent Rasmussen 
<exo bent-rasmussen.info> wrote:

 I agree. but why not.

     alias BufferedStream!(RawFile) File;

as in most cases people want a buffer file. But.. you may want an unbuffered one.

I follow the principle that if a' is an extension of a, then the name of a' signals that. I don't, in general, choose a name for a which is more complex than the name of a'.

I agree, this is good logical progression.
 Of course it is nice to have the common case have a short name, but it's
 hardly difficult to remember it once you've learned it, and a shorter 
 name
 can perhaps be found.

True. Because it's logical. You see BufferedFile you guess File might exist.
 A prominent exception is uint vs int, where the simple type has a name 
 that
 expresses an exception to a less simple type,

Which is the simple type? uint or int?
 but either I'm used to it or
 it's just so discreet that it doesn't bother me. And if both types have
 non-composite names, then the whole "issue" disappears (e.g. natural,
 integer, real.)

Basically I think it's convenience vs logical progression.
 "File" and "RawFile" short and to the point. :0)

It is. Don't mind me. :)
 Regan


Jun 27 2004
prev sibling parent Derek <derek psyc.ward> writes:
On Fri, 25 Jun 2004 11:31:12 +0000 (UTC), Arcane Jill wrote:

 In article <opr95cp4td5a2sq9 digitalmars.com>, Regan Heath says...
 A File /IS/ a stream. How could it not be? Sorry, I just didn't 
 understand you
 here.

That's my point. I don't think it should be a stream. fopen etc. is not a stream. We want something as a drop-in replacement for that; then we write a stream class, one that will take any class that supports read(), write() etc.

That's what most of us mean by "stream". A stream is simply something that does read() and write().

I regard streams as something that allows reads/writes of an arbitrary number of bytes. Other file types only allow read/write of records, rather than bytes. I only became aware of streams when I left the IBM mainframe world. -- Derek Melbourne, Australia
Jun 25 2004
prev sibling next sibling parent reply "Carlos Santander B." <carlos8294 msn.com> writes:
"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cbf8jg$221b$1 digitaldaemon.com
|
| ...
|
| 7) EITHER Stream classes should be auto (likely to be an unpopular suggestion, I
| know), OR there should be an auto wrapper class that you can construct from a
| Stream, in order to guarantee that the file will be closed in the event of an
| exception (which could of course be thrown by ANY piece of code). Currently we
| have to either roll our own auto wrapper, or use a try/catch block.
|
| ...
|
| Arcane Jill

You can do:

auto File myFile = new File (...);

And it'll work just as if File was auto.
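In the D of this thread, auto on a class reference asks for the object to be cleaned up when the enclosing scope exits. For readers on a recent D compiler, where auto only infers the type, a comparable guarantee can be sketched with a scope(exit) guard. DemoFile and useFile below are hypothetical stand-ins, not std.stream classes.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a stream class; only close() matters here.
class DemoFile
{
    bool open = true;
    void close() { open = false; }
}

// Fails part-way through; the scope(exit) guard still runs close()
// while the exception unwinds out of this function.
void useFile(DemoFile f)
{
    scope (exit) f.close();
    throw new Exception("something went wrong while the file was open");
}

void main()
{
    auto f = new DemoFile;
    try
    {
        useFile(f);
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg);
    }
    writeln("still open? ", f.open);
}
```

This covers the "closed in the event of an exception" requirement from point 7 without a hand-written wrapper class or an explicit try/catch around every use.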

-----------------------
Carlos Santander Bernal
Jun 24 2004
parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Carlos Santander B."  wrote
 You can do:

 auto File myFile = new File (...);

 And it'll work just as if File was auto.

Yeah, isn't that cool? I think it's the biz.
Jun 24 2004
prev sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Arcane Jill wrote:

 In article <cbf0lp$1lvi$1 digitaldaemon.com>, Ben Hinkle says...
 
I'd first like to see what Sean does with std.stream plus I tend to agree
with Matthew that more discussion is needed before jumping too soon in any
one direction. There's a lot of cool stuff in mango that I'd love to see
somehow merged with std.stream if possible. Maybe I'll give a poke at
porting Kris's tokenizers and endian stuff to std.stream just to see what
it looks like.

I'd be quite happy if std.stream were to be improved. Here are some suggestions. You'll probably think that many of them are trivial, but each, in their own way, contributes a small amount of annoyance, and I'm sure these things could be easily got rid of. 1) Since it is more normal to want buffered file access than non-buffered file access (in C, fopen() is called more often than open()), it makes sense that File should be buffered by default, and there should be a separate class, maybe called RawFileStream or something, for the unbuffered case.

Funny you should suggest RawFileStream because the first version of stream.d that I sent Walter had "RawFile" and "File" instead of "File" and "BufferedFile". I decided to go with File and BufferedFile for backwards compatibility and to avoid buffering stdin/out (unless the type of stdin/out was changed to Stream instead of File). Going back to buffering by default is probably better long-term.
 2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.
 3) FileMode.In and FileMode.Out should be renamed Filemode.IN and
 Filemode.OUT respectively.

why not FileMode? The dmd "style guide" page indicates FileMode would be preferred. The style guide also says all enums should be caps so IN/OUT seems right (though I tend to think we should move away from the historical baggage of C's preprocessor since FileMode.In doesn't look to me like a variable or anything else besides a constant).
 4) It should be possible to construct a File object in create mode, in one
 step. As in File f = new File(filename, FileMode.CREATE);

yup. or'able with In/Out.
 5) In fact, all possible combinations of file opening supported by fopen()
 should be supported by File. It should be possible to assert that the file
 does or does not exist before opening it (atomically), to truncate or not
 truncate, to position the file pointer at the start or end of the file, to
 allow append-only access, etc.

 6) The destructor should always close the file

 7) EITHER Stream classes should be auto (likely to be an unpopular
 suggestion, I know), OR there should be an auto wrapper class that you can
 construct from a Stream, in order to guarantee that the file will be
 closed in the event of an exception (which could of course be thrown by
 ANY piece of code). Currently we have to either roll our own auto wrapper,
 or use a try/catch block.
 
 8) Documentation should be complete and accurate.
 
 9) There should be a FilterStream class, from which BufferedStream
 inherits, so that we can write our own stream filters. (Java does this.
 It's neat).
 
 10) Streams don't necessarily have to do transcoding (see - I learnt a new
 word), but nonetheless it should be POSSIBLE to construct them from a
 Reader/Writer in order to make such extensions possible in the future.
 
 11) I want the function available(), as Java has. A buffered stream always
 knows how much it's got left in its buffer, and I have no problem with an
 unbuffered stream returning zero.
 
 12) stdin, stdout and stderr should be globally available D streams.
 (Maybe they are already, but point (8) means there's a lot I don't know
 about existing capabilities)

All look very reasonable.
 13) Streams should overload the << and >> operators. (Someone suggested
 using ~. That would be fine too).

I think Walter is hoping the typesafe varargs changes will remove an important motivation for adding << and >> (though there is the question of run-time vs compile-time safety). I have been playing around with making std.stream's printf typesafe and that's why I was trying to rebuild the phobos unittests.
 None of these is particularly difficult in and of itself, but together
 they add up to a frustrating gripe list. But I'm fairly confident that if
 these flaws are fixed (along with any other gripes which others may
 mention in the course of this thread) then I imagine that most people will
 be pretty happy with new improved std.stream.
 
 
 
 
 
This will probably open up rat-holes, but two quick examples of things to
discuss:

1) in mango it looks like to open a file and read it you need to create a
FileConduit and pass that to a Reader constructor. So you have to grok the
difference between Conduits and Readers/Writers (and maybe Buffers? I
notice IConduit has a createBuffer method so is it not buffered by
default? I'm not sure). In std.stream you make one object and there is
less to grok. The flexibility of mango is probably nice but it adds
complexity. Each person has a different notion of where to draw the
boundaries.

But there is logic behind it. Currently, D does no transcoding - that is, writeLine() will spit out raw UTF-8. Now that's fine if your output is going to a text file, but if it's going to a console, you're screwed. Now you COULD simplify this a bit by "automatically" encoding the output in the operating system default encoding - but that would just reverse the problem. Now, output to the console would be fine, but output destined to leave your machine and end up on someone else's machine (e.g. text file, socket, etc.) would also be similarly munged. UTF-8 is pretty much the best portable format, so ideally you only want to encode at the last minute, just before the stream hits the user.
2) in mango to use object serialization/deserialization you register an
instance of a class so that means at startup you basically have to
instantiate one of every class that might want to be deserialized. Seems
wasteful and it could affect class design to avoid having classes that have
interdependencies.

I'm not convinced that serialization necessarily has anything to do with streams. You could serialize to a string, or an in-memory buffer. I guess that would be faster for small objects but disadvantageous for very large ones. In any case, you don't need to decide on a firm serialization policy in order to make streams feel nice. That can come later, once we're happy with the basics. Arcane Jill

Jun 25 2004
next sibling parent Arcane Jill <Arcane_member pathlink.com> writes:
In article <cbh3le$1o0j$1 digitaldaemon.com>, Ben Hinkle says...

 3) FileMode.In and FileMode.Out should be renamed Filemode.IN and
 Filemode.OUT respectively.

why not FileMode?

Sorry, that was a typo. I meant FileMode.IN and FileMode.OUT. It was the IN and OUT that I wanted to suggest changing, not FileMode. (Although in hindsight, perhaps it should be Stream.IN and Stream.OUT anyway). Jill
Jun 25 2004
prev sibling next sibling parent reply Sam McCall <tunah.d tunah.net> writes:
Ben Hinkle wrote:

2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.

Well, Java uses File to represent a path. I like the idea of

     File f = new File("C:\\something.txt");

Tests:

     f.exists()
     f.isFile()       // exists and is regular file
     f.isDirectory()  // exists and is directory
     f.isDevice()     // unix devices, (windows COM1 etc?)
     f.canRead()
     f.canWrite()

and then

     f.open("r+")       // returns a FileStream
     f.open("a",false)  // returns a RawFileStream,
                        // optional arguments win again

Sam
Jun 25 2004
parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Sam McCall wrote:

 Ben Hinkle wrote:
 
2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.

Well, Java uses File to represent a path. I like the idea of

     File f = new File("C:\\something.txt");

Tests:

     f.exists()
     f.isFile()       // exists and is regular file
     f.isDirectory()  // exists and is directory
     f.isDevice()     // unix devices, (windows COM1 etc?)
     f.canRead()
     f.canWrite()

and then

     f.open("r+")       // returns a FileStream
     f.open("a",false)  // returns a RawFileStream,
                        // optional arguments win again

Sam

I don't think this is one of Java's highlights, though. A File class shouldn't include directories. Users have a pretty good understanding of what the word "File" means and it doesn't include directories. I think the general consensus is that Java's File class should have been called something like Path.
Jun 25 2004
next sibling parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cbiove$12pm$1 digitaldaemon.com...
 Sam McCall wrote:

 Ben Hinkle wrote:

2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.

Well, Java uses File to represent a path. I like the idea of

     File f = new File("C:\\something.txt");

Tests:

     f.exists()
     f.isFile()       // exists and is regular file
     f.isDirectory()  // exists and is directory
     f.isDevice()     // unix devices, (windows COM1 etc?)
     f.canRead()
     f.canWrite()

and then

     f.open("r+")       // returns a FileStream
     f.open("a",false)  // returns a RawFileStream,
                        // optional arguments win again

Sam

I don't think this is one of Java's highlights, though. A File class shouldn't include directories. Users have a pretty good understanding of what the word "File" means and it doesn't include directories. I think the general consensus is that Java's File class should have been called something like Path.

I would tend to agree with Ben, but Path should be for working with file paths only. Given the subject title, I think it's cool to point out that mango.io places such "attribute" methods in a related class called FileProxy. That way they don't interfere with or pollute mango.io.FilePath. FileProxy takes a FilePath as a constructor argument and, if/when you're ready to actually use the file, FileConduit accepts either FilePath or FileProxy in its constructor. Decent names are always the hardest thing to come up with though. I usually fail miserably.
Jun 25 2004
prev sibling parent Derek <derek psyc.ward> writes:
On Fri, 25 Jun 2004 23:02:14 -0400, Ben Hinkle wrote:

 Sam McCall wrote:
 
 Ben Hinkle wrote:
 
2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.

Well, Java uses File to represent a path. I like the idea of

     File f = new File("C:\\something.txt");

Tests:

     f.exists()
     f.isFile()       // exists and is regular file
     f.isDirectory()  // exists and is directory
     f.isDevice()     // unix devices, (windows COM1 etc?)
     f.canRead()
     f.canWrite()

and then

     f.open("r+")       // returns a FileStream
     f.open("a",false)  // returns a RawFileStream,
                        // optional arguments win again

Sam

I don't think this is one of Java's highlights, though. A File class shouldn't include directories. Users have a pretty good understanding of what the word "File" means and it doesn't include directories. I think the general consensus is that Java's File class should have been called something like Path.

I'm sure that this comes from Unix and other operating systems in which a directory is just a special sort of file - one that contains a list of files. -- Derek Melbourne, Australia
Jun 26 2004
prev sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 07:52:21 -0400, Ben Hinkle <bhinkle4 juno.com> wrote:

 Arcane Jill wrote:

 In article <cbf0lp$1lvi$1 digitaldaemon.com>, Ben Hinkle says...

 I'd first like to see what Sean does with std.stream plus I tend to 
 agree
 with Matthew that more discussion is needed before jumping too soon in 
 any
 one direction. There's a lot of cool stuff in mango that I'd love to 
 see
 somehow merged with std.stream if possible. Maybe I'll give a poke at
 porting Kris's tokenizers and endian stuff to std.stream just to see 
 what
 it looks like.

I'd be quite happy if std.stream were to be improved. Here are some suggestions. You'll probably think that many of them are trivial, but each, in their own way, contributes a small amount of annoyance, and I'm sure these things could be easily got rid of. 1) Since it is more normal to want buffered file access than non-buffered file access (in C, fopen() is called more often than open()), it makes sense that File should be buffered by default, and there should be a separate class, maybe called RawFileStream or something, for the unbuffered case.

Funny you should suggest RawFileStream because the first version of stream.d that I sent Walter had "RawFile" and "File" instead of "File" and "BufferedFile". I decided to go with File and BufferedFile for backwards compatibility and to avoid buffering stdin/out (unless the type of stdin/out was changed to Stream instead of File). Going back to buffering by default is probably better long-term.

i.e.

     template BufferedStream(T) { ..etc.. }
     class RawFile { read(); write(); }
     typedef BufferedStream!(RawFile) File;
 2) File should in any case be renamed FileStream

but what else would a File be? ;-) Personally I like the analogy with stdio's FILE.

Using my new found knowledge that whatever implements read() and write() is a stream, I agree. Why waste time typing FileStream when we can type File.
 3) FileMode.In and FileMode.Out should be renamed Filemode.IN and
 Filemode.OUT respectively.

why not FileMode? The dmd "style guide" page indicates FileMode would be preferred. The style guide also says all enums should be caps so IN/OUT seems right (though I tend to think we should move away from the historical baggage of C's preprocessor since FileMode.In doesn't look to me like a variable or anything else besides a constant).

I find having to type FileMode. all the time annoying. After all, the variable required is a FileMode, so logically anything I pass must be a FileMode. So why type it? I'd rather...

     File f;
     f.open("foo.txt",READ|WRITE|APPEND);

Surely this is possible?
 4) It should be possible to construct a File object in create mode, in 
 one
 step. As in File f = new File(filename, FileMode.CREATE);

yup. or'able with In/Out.

Definitely. See my other post, I think we need:

     "r"  - READ        - read, fails if file does not exist.
     "w"  - CREATE      - write, overwrite existing.
     "a"  - APPEND      - write, create if not exist.
     "r+" - READ|WRITE  - read, write, fails if file does not exist.
     "w+" - READ|CREATE - read, write, overwrite existing.
     "a+" - READ|APPEND - read, append, create if not exist.

to emulate all the fopen options, which IMO is a basic requirement, and

     WRITE|NEW - write, fail if file exists.

to give us more options than fopen does. :)
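The mapping listed above could be expressed along these lines. This is just a sketch for a recent D compiler: the flag values and the toFopenMode helper are made up for illustration, and nothing here is part of std.stream. The flags are declared in an anonymous enum so they can be written without a FileMode. prefix.

```d
import std.stdio : writeln;

// Hypothetical flag values; an anonymous enum keeps the names unqualified.
enum : uint
{
    READ   = 1,
    WRITE  = 2,
    CREATE = 4,    // write, overwriting any existing file
    APPEND = 8,    // write at the end, creating the file if needed
    NEW    = 16,   // fail if the file already exists
}

// Returns the fopen() mode string for a combination from the list above,
// or null when that combination has no entry in the list.
string toFopenMode(uint mode)
{
    switch (mode)
    {
        case READ:          return "r";
        case CREATE:        return "w";
        case APPEND:        return "a";
        case READ | WRITE:  return "r+";
        case READ | CREATE: return "w+";
        case READ | APPEND: return "a+";
        default:            return null;   // e.g. WRITE | NEW
    }
}

void main()
{
    writeln(toFopenMode(READ | WRITE));
    writeln(toFopenMode(READ | APPEND));
    writeln(toFopenMode(WRITE | NEW) is null);
}
```

A real open() would presumably handle WRITE|NEW through the operating system's exclusive-create facility rather than through fopen(), which is the "more options than fopen" part of the proposal.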
 5) In fact, all possible combinations of file opening supported by 
 fopen()
 should be supported by File. It should be possible to assert that the 
 file
 does or does not exist before opening it (atomically), to truncate or 
 not
 truncate, to position the file pointer at the start or end of the file, 
 to
 allow append-only access, etc.

 6) The destructor should always close the file

 7) EITHER Stream classes should be auto (likely to be an unpopular
 suggestion, I know), OR there should be an auto wrapper class that you 
 can
 construct from a Stream, in order to guarantee that the file will be
 closed in the event of an exception (which could of course be thrown by
 ANY piece of code). Currently we have to either roll our own auto 
 wrapper,
 or use a try/catch block.

 8) Documentation should be complete and accurate.

 9) There should be a FilterStream class, from which BufferedStream
 inherits, so that we can write our own stream filters. (Java does this.
 It's neat).

 10) Streams don't necessarily have to do transcoding (see - I learnt a 
 new
 word), but nonetheless it should be POSSIBLE to construct them from a
 Reader/Writer in order to make such extensions possible in the future.

 11) I want the function available(), as Java has. A buffered stream always
 knows how much it's got left in its buffer, and I have no problem with an
 unbuffered stream returning zero.

 12) stdin, stdout and stderr should be globally available D streams.
 (Maybe they are already, but point (8) means there's a lot I don't know
 about existing capabilities)

All look very reasonable.
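
As an illustration of point 7 (guaranteeing the close even when an exception is thrown), here is a small sketch in present-day D, which postdates this thread and offers scope guards. DemoStream and useStream are hypothetical stand-ins invented for the example, not std.stream classes.

```d
// Hypothetical stand-in for a stream that must be closed explicitly.
class DemoStream
{
    bool isOpen = true;

    void close() { isOpen = false; }
}

// Uses the stream and guarantees close() runs however the function exits.
void useStream(DemoStream s, bool fail)
{
    scope (exit) s.close(); // runs on normal return and on a thrown exception

    if (fail)
        throw new Exception("simulated failure while using the stream");
}

unittest
{
    auto s = new DemoStream;
    try
    {
        useStream(s, true);
    }
    catch (Exception e)
    {
        // the failure is expected here; only the close matters
    }
    assert(!s.isOpen);
}
```

This is only one way to get the guarantee; an auto wrapper class, as suggested in the list, would put the same close() call in a destructor instead.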
 13) Streams should overload the << and >> operators. (Someone suggested
 using ~. That would be fine too).

I think Walter is hoping the typesafe varargs changes will remove an important motivation for adding << and >> (though there is the question of run-time vs compile-time safety). I have been playing around with making std.stream's printf typesafe and that's why I was trying to rebuild the phobos unittests.
 None of these is particularly difficult in and of itself, but together
 they add up to a frustrating gripe list. But I'm fairly confident that if
 these flaws are fixed (along with any other gripes which others may
 mention in the course of this thread) then I imagine that most people will
 be pretty happy with new improved std.stream.





 This will probably open up rat-holes, but two quick examples of things to
 discuss:

 1) in mango it looks like to open a file and read it you need to create a
 FileConduit and pass that to a Reader constructor. So you have to grok the
 difference between Conduits and Readers/Writers (and maybe Buffers? I
 notice IConduit has a createBuffer method so is it not buffered by
 default? I'm not sure). In std.stream you make one object and there is
 less to grok. The flexibility of mango is probably nice but it adds
 complexity. Each person has a different notion of where to draw the
 boundaries.

But there is logic behind it. Currently, D does no transcoding - that is, writeLine() will spit out raw UTF-8. Now that's fine if your output is going to a text file, but if it's going to a console, you're screwed. Now you COULD simplify this a bit by "automatically" encoding the output in the operating system default encoding - but that would just reverse the problem. Now, output to the console would be fine, but output destined to leave your machine and end up on someone else's machine (e.g. text file, socket, etc.) would also be similarly munged. UTF-8 is pretty much the best portable format, so ideally you only want to encode at the last minute, just before the stream hits the user.
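
A minimal sketch of the "encode at the last minute" idea, in present-day D: text stays UTF-8 everywhere inside the program and is converted only at the boundary where it reaches the user. encodeForConsole is a hypothetical name, and UTF-16 is used purely as a stand-in for whatever encoding a given console actually expects.

```d
import std.utf : toUTF16;

// Hypothetical boundary function: called only when text is about to be
// shown to the user. Everything upstream keeps working with UTF-8.
wstring encodeForConsole(string utf8Text)
{
    return toUTF16(utf8Text);
}

unittest
{
    string line = "na\u00EFve";  // 5 characters, 6 UTF-8 code units
    assert(line.length == 6);

    // Data written to a file or socket would be sent as `line` unchanged;
    // only the console path goes through the conversion.
    assert(encodeForConsole(line).length == 5);
}
```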
 2) in mango to use object serialization/deserialization you register an
 instance of a class so that means at startup you basically have to
 instantiate one of every class that might want to be deserialized. Seems
 wasteful and it could affect class design to avoid having classes that have
 interdependencies.

I'm not convinced that serialization necessarily has anything to do with streams. You could serialize to a string, or an in-memory buffer. I guess that would be faster for small objects but disadvantageous for very large ones. In any case, you don't need to decide on a firm serialization policy in order to make streams feel nice. That can come later, once we're happy with the basics.

Arcane Jill


-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 25 2004
parent reply "Carlos Santander B." <carlos8294 msn.com> writes:
"Regan Heath" <regan netwin.co.nz> wrote in message
news:opr959lrqc5a2sq9 digitalmars.com
|
| ...
|
| I find having to type FileMode. all the time annoying. After all the
| variable required is a FileMode so logically anything I pass must be a
| FileMode. so why type it?
|
| I'd rather...
|
| File f;
| f.open("foo.txt",READ|WRITE|APPEND);
|
| Surely this is possible?
|
| ...
|

The problem is when you have different modules, where each can define READ,
etc., in their own particular, conflicting way (sorry, can't think of an
example). If you fully qualify them, then there'll be no conflict at all
(unless they define FileMode too, but that wouldn't be too likely).

-----------------------
Carlos Santander Bernal
Jun 25 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Fri, 25 Jun 2004 17:56:01 -0500, Carlos Santander B. 
<carlos8294 msn.com> wrote:

 "Regan Heath" <regan netwin.co.nz> wrote in message
 news:opr959lrqc5a2sq9 digitalmars.com
 |
 | ...
 |
 | I find having to type FileMode. all the time annoying. After all the
 | variable required is a FileMode so logically anything I pass must be a
 | FileMode. so why type it?
 |
 | I'd rather...
 |
 | File f;
 | f.open("foo.txt",READ|WRITE|APPEND);
 |
 | Surely this is possible?
 |
 | ...
 |

 The problem is when you have different modules, where each can define READ,
 etc., in their own particular, conflicting way (sorry, can't think of an
 example). If you fully qualify them, then there'll be no conflict at all
 (unless they define FileMode too, but that wouldn't be too likely).

I see what you're saying, but I don't think there is a conflict, because...

enum FileMode { READ, WRITE }
enum ReganMode { READ, WRITE }

void foo(FileMode a);
void bar(ReganMode a);

foo(READ);
bar(READ);

When the compiler sees foo, it knows it needs a FileMode, thus READ is interpreted as FileMode.READ.
When the compiler sees bar, it knows it needs a ReganMode, thus READ is interpreted as ReganMode.READ.

Regan.
 -----------------------
 Carlos Santander Bernal

-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 25 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Sat, 26 Jun 2004 11:08:44 +1200, Regan Heath <regan netwin.co.nz> wrote:

 On Fri, 25 Jun 2004 17:56:01 -0500, Carlos Santander B. 
 <carlos8294 msn.com> wrote:

 "Regan Heath" <regan netwin.co.nz> wrote in message
 news:opr959lrqc5a2sq9 digitalmars.com
 |
 | ...
 |
 | I find having to type FileMode. all the time annoying. After all the
 | variable required is a FileMode so logically anything I pass must be a
 | FileMode. so why type it?
 |
 | I'd rather...
 |
 | File f;
 | f.open("foo.txt",READ|WRITE|APPEND);
 |
 | Surely this is possible?
 |
 | ...
 |

 The problem is when you have different modules, where each can define READ,
 etc., in their own particular, conflicting way (sorry, can't think of an
 example). If you fully qualify them, then there'll be no conflict at all
 (unless they define FileMode too, but that wouldn't be too likely).

 I see what you're saying, but I don't think there is a conflict, because...

 enum FileMode { READ, WRITE }
 enum ReganMode { READ, WRITE }

 void foo(FileMode a);
 void bar(ReganMode a);

 foo(READ);
 bar(READ);

 When the compiler sees foo, it knows it needs a FileMode, thus READ is interpreted as FileMode.READ.
 When the compiler sees bar, it knows it needs a ReganMode, thus READ is interpreted as ReganMode.READ.

This is kinda like, but not quite the same as, the 'with' statement, and actually more similar to the C++ namespace idea. When processing the 'a' variable for 'foo' the namespace is set to that of 'FileMode', so READ matches the one in FileMode.

So you could conceivably say...

foo(READ|cast(FileMode)ReganMode.WRITE);

as you're explicitly stating the namespace of WRITE. The cast above is required as ReganMode is not a FileMode.

Regan

-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
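
For comparison, a short sketch of what is possible in present-day D without the proposed lookup rule: members of a named enum have to be qualified, but a 'with' block can supply the qualification for a whole scope. The enums mirror the ones in the post; foo and bar are given trivial bodies here purely so the example compiles, which is an assumption of the sketch.

```d
enum FileMode  { READ, WRITE }
enum ReganMode { READ, WRITE }

// Trivial bodies, only so the calls below can be checked.
string foo(FileMode a)  { return "FileMode"; }
string bar(ReganMode a) { return "ReganMode"; }

unittest
{
    // Fully qualified, as Carlos suggests: no conflict between the enums.
    assert(foo(FileMode.READ) == "FileMode");

    // Inside the with block a bare READ means ReganMode.READ.
    with (ReganMode)
    {
        assert(bar(READ) == "ReganMode");
    }
}
```

The difference from the proposal is that the scope is chosen explicitly by the caller rather than inferred from the parameter type.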
Jun 25 2004
prev sibling next sibling parent reply s <s_member pathlink.com> writes:
In article <cbf0lp$1lvi$1 digitaldaemon.com>, Ben Hinkle says...
I'd first like to see what Sean does with std.stream plus I tend to agree
with Matthew that more discussion is needed before jumping too soon in any
one direction. There's a lot of cool stuff in mango that I'd love to see
somehow merged with std.stream if possible. Maybe I'll give a poke at
porting Kris's tokenizers and endian stuff to std.stream just to see what
it looks like.

I've gotten a bit bogged down with formatted i/o, but perhaps I'll cut some corners and just get a proof of concept done. I'm mostly interested in this as something to spur further discussion anyway.
This will probably open up rat-holes, but two quick examples of things to
discuss:

1) in mango it looks like to open a file and read it you need to create a
FileConduit and pass that to a Reader constructor. So you have to grok the
difference between Conduits and Readers/Writers (and maybe Buffers? I
notice IConduit has a createBuffer method so is it not buffered by default?
I'm not sure). In std.stream you make one object and there is less to grok.
The flexibility of mango is probably nice but it adds complexity. Each
person has a different notion of where to draw the boundaries.

Seems you have the same issues with Mango that I do :) Though in this case I think it would be fairly simple to provide some standard wrapper classes that take care of this all invisibly. If Mango were incorporated, this is something I'd push for.
2) in mango to use object serialization/deserialization you register an
instance of a class so that means at startup you basically have to
instantiate one of every class that might want to be deserialized. Seems
wasteful and it could affect class design to avoid having classes that have
interdependencies.

Haven't gotten this far with the design model, but there must be a way to change this. As you say, I can see this not working with interdependent classes.

Sean
Jun 24 2004
parent Sean Kelly <sean f4.ca> writes:
Oops, something went wrong.  That 's' person is me.

Sean
Jun 24 2004
prev sibling parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle"  wrote
 I'd first like to see what Sean does with std.stream plus I tend to agree
 with Matthew that more discussion is needed before jumping too soon in any
 one direction. There's a lot of cool stuff in mango that I'd love to see
 somehow merged with std.stream if possible. Maybe I'll give a poke at
 porting Kris's tokenizers and endian stuff to std.stream just to see what
 it looks like.

While it's truly great (and encouraging) to see support for mango.io, I fully agree with Matthew on this point. It's too soon for either package to be the "standard", although I do have a bias. I would encourage those who like mango.io to please lend a hand in making it even better <G>
 1) in mango it looks like to open a file and read it you need to create a
 FileConduit and pass that to a Reader constructor. So you have to grok the
 difference between Conduits and Readers/Writers (and maybe Buffers? I
 notice IConduit has a createBuffer method so is it not buffered by default?
 I'm not sure). In std.stream you make one object and there is less to grok.
 The flexibility of mango is probably nice but it adds complexity. Each
 person has a different notion of where to draw the boundaries.

Good point. The long way is to do the following:

    FileConduit fc = new FileConduit ("foo.bar", FileStyle.ReadExisting);
    Reader reader = new Reader (fc);
    reader >> x >> y >> z;
    // or reader.get(x).get(y).get(z);

The reason for the split is to allow flexibility in what kind of reader you want to use (binary, text, regex, endian, whatever). It also opens up a nice way of doing Reader chaining, plus it allows one to pass any kind of IConduit to some method, and said method can decide how it wishes to 'communicate' by applying a Reader (and/or Writer) of its choice. I've always preferred having the choice of decoupled processing in this manner. There's nothing to prevent some simple wrappers being provided, such as this:

    TextReader tr = new TextFileReader ("foo.bar"); // or whatever

There is also a split underneath the Conduit covers which separates the Buffer itself from the Conduit. This allows one to apply any and all Readers & Writers directly upon a simple Buffer (without a socket or file attached) so there's complete symmetry in how you manipulate memory-based idioms. This is quite different from std.outbuffer, and also directly supports the notion of asking the FileConduit for a memory-mapped view instead of a standard (buffered) view.

There's usually a trade-off regarding flexibility and ease-of-use. Wrappers are a great idea to make it trivial for newbies. But the "long way" is pretty simple too. I really hope mango.io can achieve both :-)

The dependency chain is thus: IReader/IWriter => IBuffer => IConduit; where both left and right sides are optional. That is, the central crux of mango.io is IBuffer. There's a document over here that discusses it a bit further:

http://svn.dsource.org/svn/projects/mango/trunk/Mango%20IO%20overview.pdf
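
To make the layering concrete, here is a deliberately tiny sketch in present-day D of a reader sitting on top of a swappable data source, with the chained get() style shown above. Conduit, MemoryConduit and ByteReader are hypothetical names invented for this sketch and do not reproduce mango.io's actual interfaces; the Buffer layer in the middle is left out to keep it short.

```d
// Hypothetical minimal source of raw bytes (file, socket, memory, ...).
interface Conduit
{
    // Copies up to dst.length bytes into dst, returns how many were copied.
    size_t read(ubyte[] dst);
}

// In-memory source, so the example needs no file on disk.
final class MemoryConduit : Conduit
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] dst)
    {
        immutable n = dst.length < data.length ? dst.length : data.length;
        dst[0 .. n] = data[0 .. n];
        data = data[n .. $]; // consume what was handed out
        return n;
    }
}

// The reader decides how bytes are interpreted; here: one ubyte at a time.
final class ByteReader
{
    private Conduit source;

    this(Conduit source) { this.source = source; }

    // Returns this so that calls can be chained.
    ByteReader get(ref ubyte value)
    {
        ubyte[1] tmp;
        if (source.read(tmp[]) != 1)
            throw new Exception("unexpected end of data");
        value = tmp[0];
        return this;
    }
}

unittest
{
    ubyte[] raw = [10, 20, 30];
    auto reader = new ByteReader(new MemoryConduit(raw));

    ubyte x, y, z;
    reader.get(x).get(y).get(z);
    assert(x == 10 && y == 20 && z == 30);
}
```

Swapping MemoryConduit for a file-backed class would leave ByteReader untouched, which is the decoupling the post describes; a convenience wrapper would simply construct both objects in one call.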
 2) in mango to use object serialization/deserialization you register an
 instance of a class so that means at startup you basically have to
 instantiate one of every class that might want to be deserialized. Seems
 wasteful and it could affect class design to avoid having classes that have
 interdependencies.

Yeah, there's been talk of using an Interface to do that instead. There's more on that over here:

http://www.dsource.org/forums/viewtopic.php?p=1264#1264

I'd also like to add that mango.io does not grok unicode at this time, cos' I think it's better to take advantage of what Hauke and AJ provide. There's a number of other missing or bogus items that you can check out over here (in the "Mango Sucks ..." thread):

http://www.dsource.org/forums/viewtopic.php?t=157

I would encourage folks to add their own gripes to the (growing) list <g>

- Kris
Jun 24 2004
prev sibling next sibling parent "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
-1

Premature

"Unknown lurker" <Unknown_member pathlink.com> wrote in message
news:cbepgc$1bbh$1 digitaldaemon.com...
 Maybe we should have cast a vote to convince Walter to deprecate std.stream and
 add mango.io to phobos. Please reply only with +1 / -1 / 0.

 +1

Jun 24 2004
prev sibling parent "C. Sauls" <ibisbasenji yahoo.com> writes:
-1

I love mango.io, truly I do.  I've been using it for pretty much 
everything since it was still called DSC.  /But/ there are still times 
when I just want to pump something small out, and get it done fast, and 
at times std.stream is quite useful for that.  I'd just as soon continue 
to have std.stream get improved (and split into multiple modules, 
frankly, i.e. std.io.(stream, file, mem, ...)) and keep mango as a 
"non-standard" independent open-source library that just happens to kick 
some arse.  :)

-Chris S.
-Invironz
Jun 24 2004