
digitalmars.D.learn - D2 byChunk

Hi all,

I am currently working on a parser for a file format. I wanted to use the
std.stdio.ByChunk range to read from a file and extract tokens from the
chunks. Obviously the current chunk can end before a complete token has
been extracted, in which case I can ask the range for the next chunk.
To keep the part I have already read, I need to dup at least the
unprocessed part of the old chunk and concatenate it in front of the
next one, or else write code that behaves as if the two were
concatenated. This looks like a stupid approach to me.

Here is a small example:

file contents: "Hello world"
chunks: "Hello w" "orld"

First I read the token "Hello" from the first chunk and maybe skip the
whitespace. Then I am left with the "w" (which I need to copy out of the
buffer, because ByChunk will overwrite it) and get "orld".
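
To make the copy-and-concatenate approach concrete, here is a minimal
sketch for the example above. The function name wordsByChunk and the
rule that tokens are separated by single spaces are assumptions made
for illustration only; File.byChunk is the Phobos call in question.

```d
import std.stdio : File;

// Hypothetical helper: splits a file into space-separated words while
// reading it through byChunk.
string[] wordsByChunk(File f, size_t chunkSize)
{
    string[] result;
    ubyte[] pending; // unfinished tail carried over from earlier chunks

    foreach (ubyte[] chunk; f.byChunk(chunkSize))
    {
        // byChunk reuses its buffer, so the chunk has to be copied.
        // The concatenation allocates a fresh array on every iteration.
        ubyte[] data = pending ~ chunk;
        size_t pos = 0;
        for (;;)
        {
            while (pos < data.length && data[pos] == ' ')
                ++pos; // skip whitespace
            size_t stop = pos;
            while (stop < data.length && data[stop] != ' ')
                ++stop;
            if (stop == data.length)
                break; // the token may continue in the next chunk
            result ~= cast(string) data[pos .. stop].idup;
            pos = stop;
        }
        pending = data[pos .. $]; // slice of the fresh array, safe to keep
    }
    if (pending.length)
        result ~= cast(string) pending.idup; // last token at end of file
    return result;
}
```

With a file containing "Hello world" and a chunk size of 7, the second
iteration has to rebuild "world" from the saved "w" and the new chunk
"orld", which costs one allocation per chunk.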

My idea was to have a ByChunk-related object that the user can tell
how much of the buffer he/she has actually consumed, so that it can move
the unconsumed rest to the beginning of the buffer and append the next
chunk behind it. This wouldn't need further allocations and would give
the user contiguous data he/she can work with.
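
A rough sketch of what such an object might look like follows. None of
it is part of Phobos: SlidingReader, window, consume, fill and the
words function are hypothetical names, and space-separated tokens are
again an assumption. The only library calls are File.rawRead, enforce
and the C memmove.

```d
import std.stdio : File;
import std.exception : enforce;
import core.stdc.string : memmove;

// Hypothetical helper: one fixed buffer, refilled in place.
struct SlidingReader
{
    private File file;
    private ubyte[] buf;  // allocated once
    private size_t start; // first unconsumed byte
    private size_t end;   // one past the last valid byte

    this(File file, size_t capacity)
    {
        this.file = file;
        this.buf = new ubyte[capacity];
    }

    // Contiguous unconsumed data; invalidated by the next fill().
    const(ubyte)[] window() const { return buf[start .. end]; }

    // Tell the reader that the first n bytes of the window are done.
    void consume(size_t n)
    {
        assert(n <= end - start);
        start += n;
    }

    // Moves the rest to the front and reads more data behind it.
    // Returns false once the file yields no further bytes.
    bool fill()
    {
        immutable rest = end - start;
        if (start > 0 && rest > 0)
            memmove(buf.ptr, buf.ptr + start, rest); // may overlap
        start = 0;
        end = rest;
        enforce(end < buf.length, "token does not fit into the buffer");
        immutable got = file.rawRead(buf[end .. $]).length;
        end += got;
        return got > 0;
    }
}

// Hypothetical usage: split a file into space-separated words.
string[] words(File f, size_t capacity)
{
    string[] result;
    auto r = SlidingReader(f, capacity);
    bool more = true;
    while (more)
    {
        more = r.fill();
        auto w = r.window();
        size_t pos = 0;
        for (;;)
        {
            while (pos < w.length && w[pos] == ' ')
                ++pos; // skip whitespace
            size_t stop = pos;
            while (stop < w.length && w[stop] != ' ')
                ++stop;
            if (stop == w.length && more)
                break; // token may be incomplete, wait for more data
            if (stop > pos)
                result ~= cast(string) w[pos .. stop].idup;
            pos = stop;
            if (pos == w.length)
                break;
        }
        r.consume(pos);
    }
    return result;
}
```

memmove is used because the source and destination regions can overlap,
which a plain D slice assignment does not allow. The buffer size limits
the longest token, and the window is only valid until the next fill(),
which is the same reuse behaviour ByChunk already has.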

Does this approach make sense or am I doing something completely wrong?
The implementation would not be too complicated (although it would
violate the Range contract, AFAICS). But maybe there are other/better
ways to go.

Thank you!
Dec 10 2010