digitalmars.D.announce - iopipe alpha 0.0.1 version

Steven Schveighoffer <schveiguy yahoo.com> writes:

I added a tag for iopipe and added it to the dub registry so people can 
try it out.

I didn't want to add it until I had fully documented and unittested it.

http://code.dlang.org/packages/iopipe
https://github.com/schveiguy/iopipe

If you plan on using it, expect some API changes in the near future. I 
think the next step is really to add Windows support for the IODev type.

Suggestions and ideas welcome and appreciated.

I also want to add generated documentation. Does anyone know of a good 
way to generate the ddoc (or ddox or whatever) and put it directly into 
the repository for github to serve? Would be an awesome tip for people 
making projects for code.dlang.org.

-Steve

Oct 11 2017

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On Thursday, 12 October 2017 at 04:22:01 UTC, Steven 
Schveighoffer wrote:
 I added a tag for iopipe and added it to the dub registry so 
 people can try it out.

 I didn't want to add it until I had fully documented and 
 unittested it.

 http://code.dlang.org/packages/iopipe
 https://github.com/schveiguy/iopipe

 If you plan on using it, expect some API changes in the near 
 future. I think the next step is really to add Windows support 
 for the IODev type.

Might be able to help you on that using WinAPI for I/O. (I assume 
bypassing libc is one of the goals).

Oct 11 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/12/17 1:48 AM, Dmitry Olshansky wrote:
 On Thursday, 12 October 2017 at 04:22:01 UTC, Steven Schveighoffer wrote:
 I added a tag for iopipe and added it to the dub registry so people 
 can try it out.

 I didn't want to add it until I had fully documented and unittested it.

 http://code.dlang.org/packages/iopipe
 https://github.com/schveiguy/iopipe

 If you plan on using it, expect some API changes in the near future. I 
 think the next step is really to add Windows support for the IODev type.

 
 Might be able to help you on that using WinAPI for I/O. (I assume 
 bypassing libc is one of goals).

That would be awesome! Yes, the idea is to avoid any "extra" buffering. 
So using CreateFile, ReadFile, etc.

-Steve

Oct 12 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/12/17 8:41 AM, Steven Schveighoffer wrote:
 On 10/12/17 1:48 AM, Dmitry Olshansky wrote:
 On Thursday, 12 October 2017 at 04:22:01 UTC, Steven Schveighoffer wrote:
 I added a tag for iopipe and added it to the dub registry so people 
 can try it out.

 I didn't want to add it until I had fully documented and unittested it.

 http://code.dlang.org/packages/iopipe
 https://github.com/schveiguy/iopipe

 If you plan on using it, expect some API changes in the near future. 
 I think the next step is really to add Windows support for the IODev 
 type.

 Might be able to help you on that using WinAPI for I/O. (I assume 
 bypassing libc is one of goals).

 
 That would be awesome! Yes, the idea is to avoid any "extra" buffering. 
 So using CreateFile, ReadFile, etc.

Dmitry, hold off on this if you were going to do it. I have been looking 
at Jason White's io library, and think I'm going to just extract all the 
low-level types he has there as a basic io library, as they are fairly 
complete, and start from there. His library includes the ability to use 
Windows.

-Steve

Oct 16 2017

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On Monday, 16 October 2017 at 14:45:21 UTC, Steven Schveighoffer 
wrote:
 On 10/12/17 8:41 AM, Steven Schveighoffer wrote:
 On 10/12/17 1:48 AM, Dmitry Olshansky wrote:
 On Thursday, 12 October 2017 at 04:22:01 UTC, Steven 
 Schveighoffer wrote:
 [...]

 Might be able to help you on that using WinAPI for I/O. (I 
 assume bypassing libc is one of goals).

 
 That would be awesome! Yes, the idea is to avoid any "extra" 
 buffering. So using CreateFile, ReadFile, etc.

 Dmitry hold off on this if you were going to do it. I have been 
 looking at Jason White's io library, and think I'm going to 
 just extract all the low-level types he has there as a basic io 
 library, as they are fairly complete, and start from there. His 
 library includes the ability to use Windows.

Meh, not that I had much spare time to actually do anything ;)

Might help by reviewing what you have there.

Oct 16 2017

Martin Nowak <code dawg.eu> writes:

On Monday, 16 October 2017 at 19:36:20 UTC, Dmitry Olshansky 
wrote:
 Dmitry hold off on this if you were going to do it. I have 
 been looking at Jason White's io library, and think I'm going 
 to just extract all the low-level types he has there as a 
 basic io library, as they are fairly complete, and start from 
 there. His library includes the ability to use Windows.

 Meh, not that I had much spare time to actually do anything ;)

 Might help by reviewing what you have there.

Started to work on unbuffered I/O, had it in mind for quite a 
while already.
http://github.com/MartinNowak/io

Oct 16 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/16/17 4:56 PM, Martin Nowak wrote:
 On Monday, 16 October 2017 at 19:36:20 UTC, Dmitry Olshansky wrote:
 Dmitry hold off on this if you were going to do it. I have been 
 looking at Jason White's io library, and think I'm going to just 
 extract all the low-level types he has there as a basic io library, 
 as they are fairly complete, and start from there. His library 
 includes the ability to use Windows.

 Meh, not that I had much spare time to actually do anything ;)

 Might help by reviewing what you have there.

 
 Started to work on unbuffered I/O, had it in mind for quite a while 
 already.
 http://github.com/MartinNowak/io

Awesome!

Is the plan to put this into Phobos? If so, I would put it under 
std/experimental/io. However, if not, it should not be std/io.

Looks like it has all the stuff I had for my basic io type (and I see 
you have scatter read/write, that will help), so I will migrate iopipe 
to depend on it. I was thinking about using Jason White's io library, 
but I haven't seen him around in a while. Plus if this is going into 
Phobos, it would be the best thing for me to use.

Will pitch in when I can.

Thanks!

-Steve

Oct 17 2017

Suliman <evermind live.ru> writes:

I was thinking about using Jason White's io library, but I 
haven't seen him around in a while

Yes, it would be interesting if you took some ideas from his lib. 
He has a very good API.

Oct 17 2017

Martin Nowak <code dawg.eu> writes:

On Tuesday, 17 October 2017 at 13:45:02 UTC, Suliman wrote:
I was thinking about using Jason White's io library, but I 
haven't seen him around in a while

 Yes, it would be interesting if you took some ideas from his lib. 
 He has a very good API.

I previously collaborated a bit on that library as it was very 
close to the design I had in mind for years. Unfortunately the 
lib seems unmaintained now and also went somewhat off-track with 
a std.socket wrapper 
(https://github.com/jasonwhite/io/commit/3bbe43954d9c11cc892da10f656f31fff863875a#diff-a5ef7b1ce67d6295f9bdf019adc4784). 
But indeed I'll try to contact him.

Furthermore I want this to be very focused (no stat/fs 
functionality beyond what's necessary), but also add hooks for 
Fiber based async event loops.

It's really not that much work to write an unbuffered I/O 
library, so let's see where that goes.

Oct 17 2017

Martin Nowak <code dawg.eu> writes:

On Tuesday, 17 October 2017 at 12:28:28 UTC, Steven Schveighoffer 
wrote:
 Is the plan to put this into Phobos? If so, I would put it 
 under std/experimental/io. However, if not, it should not be 
 std/io.

I don't know yet how it will turn out, but phobos is very much in 
need of better Files and Sockets. Certainly the ambition is to 
write a standard-worthy library.

Honestly it seems to me that the std.experimental experiment 
didn't succeed. It's still too much overhead to develop in phobos 
(and get it reviewed/merged), there is no clear path from 
std.experimental -> std, and if something is well-proven outside of 
phobos there is no point in putting it into std.experimental in 
the first place.

Developing std.io-v0.1.0 on dub until it reaches v1.0.0 seems 
like a straightforward and obvious approach. Also at our current 
community size, I'm hardly worried about namespace clashes.
Plus I'm already using std.internal.cstring as workhorse to 
support any string-like ranges (including @nogc std.path ranges) 
and core.internal.string : unsignedToTempString to avoid the fat 
and exception throwing formattedWrite (even the templated variant 
isn't nothrow).

Oct 17 2017

Jacob Carlborg <doob me.com> writes:

On 2017-10-12 06:22, Steven Schveighoffer wrote:

 I also want to add generated documentation. Does anyone know of a good 
 way to generate the ddoc (or ddox or whatever) and put it directly into 
 the repository for github to serve? Would be an awesome tip for people 
 making projects for code.dlang.org.

I would suggest using GitHub Pages [1] for storing it.

[1] https://pages.github.com

-- 
/Jacob Carlborg

Oct 12 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/12/17 3:05 AM, Jacob Carlborg wrote:
 On 2017-10-12 06:22, Steven Schveighoffer wrote:
 
 I also want to add generated documentation. Does anyone know of a good 
 way to generate the ddoc (or ddox or whatever) and put it directly 
 into the repository for github to serve? Would be an awesome tip for 
 people making projects for code.dlang.org.

 
 I would suggest using GitHub Pages [1] for storing it.
 
 [1] https://pages.github.com
 

Thanks, I used dub -b ddox to generate the documentation, then committed 
the result, looks great!

http://schveiguy.github.io/iopipe

Release 0.0.2 has fixes for the ddoc that I didn't notice before, there 
are no actual changes in the code.

-Steve

Oct 12 2017

Martin Nowak <code dawg.eu> writes:

On Thursday, 12 October 2017 at 18:08:11 UTC, Steven 
Schveighoffer wrote:
 Release 0.0.2 has fixes for the ddoc that I didn't notice 
 before, there are no actual changes in the code.

May I recommend scod? It's just a ddox theme.
https://github.com/MartinNowak/scod

I keep https://github.com/MartinNowak/bloom also as 
example/scaffold repo, it's using an automated docs setup with 
gh-pages branches.

Just create a doc deployment token 
(https://github.com/settings/tokens) with public_repo access and 
store that encrypted in your .travis.yml.

Oct 13 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/13/17 12:49 PM, Martin Nowak wrote:
 On Thursday, 12 October 2017 at 18:08:11 UTC, Steven Schveighoffer wrote:
 Release 0.0.2 has fixes for the ddoc that I didn't notice before, 
 there are no actual changes in the code.

 
 May I recommend scod? It's just a ddox theme.
 https://github.com/MartinNowak/scod
 
 I keep https://github.com/MartinNowak/bloom also as example/scaffold 
 repo, it's using an automated docs setup with gh-pages branches.
 
 Just create a doc deployment token (https://github.com/settings/tokens) 
 with public_repo access and store that encrypted in your .travis.yml.

Martin, I would appreciate and I think many people would, a 
blog/tutorial on how to do this.

I'll look into your suggestions on the docs, thanks!

-Steve

Oct 13 2017

Martin Nowak <code dawg.eu> writes:

On Friday, 13 October 2017 at 17:08:18 UTC, Steven Schveighoffer 
wrote:
 I keep https://github.com/MartinNowak/bloom also as 
 example/scaffold repo, it's using an automated docs setup with 
 gh-pages branches.
 
 Just create a doc deployment token 
 (https://github.com/settings/tokens) with public_repo access 
 and store that encrypted in your .travis.yml.

 Martin, I would appreciate and I think many people would, a 
 blog/tutorial on how to do this.

Indeed, that already crossed my mind a couple of times ;).

Oct 16 2017

Bastiaan Veelo <Bastiaan Veelo.net> writes:

On Monday, 16 October 2017 at 20:58:43 UTC, Martin Nowak wrote:
 On Friday, 13 October 2017 at 17:08:18 UTC, Steven 
 Schveighoffer wrote:
 I keep https://github.com/MartinNowak/bloom also as 
 example/scaffold repo, it's using an automated docs setup 
 with gh-pages branches.
 
 Just create a doc deployment token 
 (https://github.com/settings/tokens) with public_repo access 
 and store that encrypted in your .travis.yml.

 Martin, I would appreciate and I think many people would, a 
 blog/tutorial on how to do this.

 Indeed, that already crossed my mind a couple of times ;).

I am searching for a blog like that. Has it been written yet?

Jan 07 2018

Martin Nowak <code dawg.eu> writes:

On Thursday, 12 October 2017 at 04:22:01 UTC, Steven 
Schveighoffer wrote:
 I added a tag for iopipe and added it to the dub registry so 
 people can try it out.

 I didn't want to add it until I had fully documented and 
 unittested it.

 http://code.dlang.org/packages/iopipe
 https://github.com/schveiguy/iopipe

Great news to see continued work on this.

I'll just use this thread to get started on design discussions. 
If there is a better place for that, let me know ;).

Questions/Ideas

- You can move docs out of the repo to fix search, e.g. by 
pushing them to a `gh-pages` branch of your repo. See 
https://github.com/MartinNowak/bloom/blob/736dc7a7ffcd2bbca7997f273a09e272e084596/travis.sh#L13 
for an automated setup using Travis-CI and ddox/scod.

- Standard device implementation?

   Your library already has the notion of devices as thin 
abstractions over file/socket handles.
   Should we start with such an unbuffered IO library as 
foundation, including support hooks for Fiber based event loops? 
Something along the lines of https://code.dlang.org/packages/io? 
Without a standard device lib, IOPipe could not be used in APIs.

   Easy enough to write, could be written over a weekend.

- What's the plan for @safe buffer/window invalidation? Right now 
you're handing out raw access to internal buffers with an 
inherent memory safety problem.

   ```d
   auto w = f.window();
   f.extend(random());
   w[0]; // ⚡ dangling pointer ⚡
   ```

   I can see how the compiler could catch that if we'd go with 
compile-time enforced safety for RC and friends. But that's still 
unclear atm. and we might end up with a runtime RC/weak ptr 
mechanism instead, which wouldn't be too good a fit for that 
window mechanism.

- What about the principle that the caller should choose 
allocation/ownership?
   Having an extend method means the IOPipe is responsible for 
growing/allocating buffers, so you'll end up with IOPipeMalloc, 
IOPipeGC, IOPipeAllocatorGrowExp (or their template 
alternatives), not very nice for APIs.

- Why contiguous memory? The current implementation reallocs and, 
even weirder, memmoves data in extend.
   https://github.com/schveiguy/iopipe/blob/3589a4c9fc72b844eb4efd3ae718773faf9ab9ed/source/iopipe/buffer.d#L171
   Shouldn't a modern IO library be as zero-copy as possible?
   The docs say random access, that should be supported by 
ringbuffers or lists/arrays of buffers. Any plans towards that 
direction?
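
A minimal sketch of the window invalidation hazard raised in the list above, assuming a hypothetical `ToyPipe` type (it is not iopipe's API, only `window` and `extend` borrow its method names). The storage here is GC-allocated, so the stale window remains valid memory but silently stops tracking the pipe; with a malloc/realloc-backed buffer the same access would read freed memory.

```d
// Hypothetical toy type for illustration; not part of iopipe.
struct ToyPipe
{
    private ubyte[] buf;   // GC-backed storage
    private size_t valid;  // number of bytes currently in the window

    // Hands out a raw slice of the internal buffer.
    ubyte[] window() { return buf[0 .. valid]; }

    // Grows the window by n bytes. It always moves to fresh storage to
    // model the worst case: a realloc that relocates the block.
    size_t extend(size_t n)
    {
        auto bigger = new ubyte[](valid + n);
        bigger[0 .. valid] = buf[0 .. valid];
        buf = bigger;
        valid += n;
        return n;
    }
}

unittest
{
    ToyPipe f;
    f.extend(4);
    auto w = f.window();   // slice of the old storage
    f.extend(4);           // storage moves; w is now stale
    f.window()[0] = 42;
    assert(w[0] == 0);     // w no longer aliases the pipe's buffer
    assert(f.window()[0] == 42);
}
```

Compile with `-unittest` to run the check. Whether a real buffer relocates on a given `extend` depends on the allocator, which is what makes this class of bug hard to spot in testing.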

Oct 13 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/13/17 11:59 AM, Martin Nowak wrote:
On Thursday, 12 October 2017 at 04:22:01 UTC, Steven Schveighoffer wrote:
I added a tag for iopipe and added it to the dub registry so people
can try it out.

I didn't want to add it until I had fully documented and unittested it.

http://code.dlang.org/packages/iopipe
https://github.com/schveiguy/iopipe

Great news to see continued work on this.

I'll just use this thread to get started on design discussions. If there
is a better place for that, let me know ;).

This is as good a place as any :) I may create some issue reports on
github to track things better.

Questions/Ideas

- You can move docs out of the repo to fix search, e.g. by pushing them
to a `gh-pages` branch of your repo.

When I tried the search it seemed to work...

See
https://github.com/MartinNowak/bloom/blob/736dc7a7ffcd2bbca7997f273a09e272e084596/travis.sh#L13
for an automated setup using Travis-CI and ddox/scod.

I admit complete ignorance on this, I need to look into it, but at the
moment, I'm OK with committing the generated docs directly as an ugly
extra step. When I looked at the options for adding a "pages" piece
to the project, I saw that if I put things under a "docs" directory, it
could use that, so that's what I went with.

- Standard device implementation?

Your library already has the notion of devices as thin abstractions
over file/socket handles.
Should we start with such an unbuffered IO library as foundation,
including support hooks for Fiber based event loops? Something along the
lines of https://code.dlang.org/packages/io? Without a standard device
lib, IOPipe could not be used in APIs.

I absolutely think this would be a great idea. In fact, you could use
Jason White's io package with iopipes directly, as his low-level types
have the necessary read function:
https://github.com/jasonwhite/io/blob/master/source/io/file/stream.d#L335

Perhaps we could coax the basic types out of that library to provide a
base for both iopipe and his high-level stuff. The stream portion of my
library is really just a throwaway piece that is not a focus of the
library. Indeed, I created it because unbuffered stream types didn't
exist anywhere (the IODev type predates iopipe, as it was part of my
original attempt to rewrite Phobos io).

- What's the plan for @safe buffer/window invalidation? Right now you're
handing out raw access to internal buffers with an inherent memory
safety problem.

I don't plan to put any restrictions on this. In fact the core purpose
of iopipe is to give raw buffer access to aid in writing higher-level
routines around it. As I said here:
https://github.com/schveiguy/iopipe/blob/master/source/iopipe/buffer.d#L217

If the Allocator supports deallocation I call it, but it may not be the
correct thing to do. There is a sticky point in
std.experimental.allocator: the GC allocator defines deallocate,
because it's available, but the *presence* of that member may be taken
to mean you have to call it to deallocate. There is no member saying
whether deallocation is optional.

In my wrapper GCNoPointerAllocator (which I needed to support allocating
ubyte buffers without having to scan them), I leave out the deallocate
function, so technically it's safe with that allocator.
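
A rough sketch of that idea, assuming a hypothetical `NoScanGCAllocator` and `releaseBuffer` helper (neither is iopipe's actual code, and the allocator implements only `allocate`, not the full std.experimental.allocator interface). It shows a GC-backed allocator that skips pointer scanning and omits `deallocate`, plus buffer code that frees explicitly only when the member exists.

```d
import core.memory : GC;

// Hypothetical sketch; not iopipe's actual GCNoPointerAllocator.
struct NoScanGCAllocator
{
    // Allocates a block that the GC owns but will not scan for pointers.
    void[] allocate(size_t n)
    {
        if (n == 0) return null;
        auto p = GC.malloc(n, GC.BlkAttr.NO_SCAN);
        return p is null ? null : p[0 .. n];
    }
    // Deliberately no `deallocate`: the GC reclaims unreachable blocks.
}

// Frees a buffer only when the allocator offers deallocation.
// Returns true if the memory was explicitly released.
bool releaseBuffer(Allocator)(ref Allocator alloc, void[] block)
{
    static if (__traits(hasMember, Allocator, "deallocate"))
    {
        alloc.deallocate(block);
        return true;
    }
    else
    {
        return false; // left to the GC; existing slices stay valid
    }
}

unittest
{
    NoScanGCAllocator a;
    auto block = a.allocate(64);
    assert(block.length == 64);
    assert(GC.getAttr(block.ptr) & GC.BlkAttr.NO_SCAN);
    assert(!releaseBuffer(a, block));
}
```

Because the old block is never freed by hand, a window taken before a reallocation keeps pointing at live memory, which is the property the post relies on.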

I will say though, at some point, I'm going to focus on making as
much of iopipe @safe as possible. That may require using the GC for buffering.

```d
auto w = f.window();
f.extend(random());
w[0]; // ⚡ dangling pointer ⚡
```

I can see how the compiler could catch that if we'd go with
compile-time enforced safety for RC and friends. But that's still
unclear atm. and we might end up with a runtime RC/weak ptr mechanism
instead, which wouldn't be too good a fit for that window mechanism.

What would be nice is a mechanism to detect this situation, since the
above is both un-@safe and incorrect code.

Possibly you could instrument a window with a mechanism to check to see
if it's still correct on every access, to be used when compiled in
non-release mode for checking program correctness.

But in terms of @safe code in release mode, I think the only option is
really to rely on the GC or reference counting to allow the window to
still exist.

- What about the principle that the caller should choose
allocation/ownership?

It can, BufferManager takes an Allocator compile-time option.

It's also possible to create your own ownership or allocation scheme as
long as you implement the required iopipe methods.

Having an extend method means the IOPipe is responsible for
growing/allocating buffers, so you'll end up with IOPipeMalloc,
IOPipeGC, IOPipeAllocatorGrowExp (or their template alternatives), not
very nice for APIs.

extend is a core part of the iopipe system. The point of the library is
that your higher-level code doesn't have to manage buffering, memory
ownership, or allocation. I've used
so many buffered streams where I still have to create my own buffer
because a quirk in the way I have to process the data doesn't fit the
API of the stream. This mitigates that by giving you direct control over
how much data should be buffered, but not burdening you with the details
of managing that memory. The mechanism was clear to me in Dmitry
Olshansky's simple back-reference toy library that he made a while back
(and actually was the inspiration for making iopipe instead of what I
was doing before).

I can't find his library any more, but here is the post he made:

https://forum.dlang.org/post/l9q66g$2he3$1 digitalmars.com

- Why contiguous memory? The current implementation reallocs and, even
weirder, memmoves data in extend.
https://github.com/schveiguy/iopipe/blob/3589a4c9fc72b844eb4efd3ae718773faf9ab9ed/source/iopipe/buffer.d#L171

Shouldn't a modern IO library be as zero-copy as possible?
The docs say random access, that should be supported by ringbuffers
or lists/arrays of buffers. Any plans towards that direction?

Yes and no :)

My original idea was that once I got simple array buffers working, I
would move on to circular buffers, and linked lists of buffers, etc,
with all the details hidden by the range itself. I still might implement
this. Windows and Posix support the notion of scatter read so you can
easily implement a way for streams to fit perfectly on top of these things.

But what I realized is that in practice (and especially when battling to
beat Phobos byLine and libc's getline), avoiding copying may not be as
important as I thought. For one thing, the focused data (the data you
care about currently) is generally much smaller than the real buffer
size. So when it is calling memmove, you are generally only moving a
tiny piece of the buffer.

Second, the CPU is really good at dealing with arrays (and searching
through arrays), especially when dereferencing data.

Third, every single access to a non-array is going to have to go through
some mechanism to check which actual array the index falls into. When
implementing iopipe's byline, I got a SIGNIFICANT speedup by copying
members of the ByLine struct (e.g. the dchar being searched for) into a
local variable. If you have a custom range for a circular buffer whose
division point has to be read on every element index, the penalties are
going to add up.
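
To make the third point concrete, here is a small sketch assuming a hypothetical `RingView` type (not something iopipe provides). Every indexed read has to decide which side of the wrap point it falls on, whereas a plain array index is a single offset; the generic `countByte` helper, also made up for this example, runs unchanged over both.

```d
// Hypothetical ring-buffer view, for illustration only.
struct RingView
{
    const(ubyte)[] storage; // fixed-size backing array
    size_t start;           // storage index of logical element 0
    size_t length;          // number of logical elements

    ubyte opIndex(size_t i) const
    {
        assert(i < length, "index out of range");
        // Extra work on every element access: find which side of the
        // wrap point the index falls on.
        immutable idx = start + i;
        return storage[idx < storage.length ? idx : idx - storage.length];
    }
}

// Counts occurrences of `needle`, written once against any indexable view.
size_t countByte(View)(View v, ubyte needle)
{
    size_t n;
    foreach (i; 0 .. v.length)
        if (v[i] == needle) ++n;
    return n;
}

unittest
{
    ubyte[] storage = [3, 4, 0, 1, 2];
    auto ring = RingView(storage, 2, 5);   // logical order: 0 1 2 3 4
    foreach (i; 0 .. 5)
        assert(ring[i] == i);

    ubyte[] flat = [0, 1, 2, 3, 4];
    ubyte needle = 3;
    assert(countByte(ring, needle) == countByte(flat, needle));
}
```

The sketch says nothing about how large the penalty is in practice; that would need measuring against the memmove cost it replaces.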

The trade-offs might still be worth it. For instance if your focused
data is a larger percentage of the total buffer (like 70%), moving it to
the front of the buffer is going to hurt performance. I don't know
whether it would overcome slower access per element. The good news is, I
can implement it, and see how it fares, since the higher level code is
abstracted to the buffer type.

And of course, any existing (non-infinite) random-access range can be
hooked as a non-extendable iopipe (see how arrays are hooked).

Thanks for all your thoughts on this, Martin!

-Steve

Oct 13 2017

Martin Nowak <code+news.digitalmars dawg.eu> writes:

On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
 What would be nice is a mechanism to detect this situation, since the
 above is both un-@safe and incorrect code.
 
 Possibly you could instrument a window with a mechanism to check to see
 if it's still correct on every access, to be used when compiled in
 non-release mode for checking program correctness.
 
 But in terms of @safe code in release mode, I think the only option is
 really to rely on the GC or reference counting to allow the window to
 still exist.

We should definitely find a @nogc solution to this, but it's a good
litmus test for the RC compiler support I'll work on.
Why do IOPipes have to hand over the window to the caller?
They could just implement the RandomAccessRange interface themselves.

Instead of
```d
auto w = f.window();
f.extend(random());
w[0];
```
you could only do
```d
f[0];
f.extend(random());
f[0]; // bug, but no memory corruption
```

This problem seems to be very similar to the Range vs. Iterators
difference, the former can perform bounds checks on indexing, the latter
are inherently unsafe (with expensive runtime debug checks e.g. in VC++).
Similarly always accessing the buffer through IOPipe would allow cheap
bounds checking, and sure you could still offer IOPipe.ptr for unsafe code.
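
A minimal sketch of that suggestion, assuming a hypothetical `IndexedPipe` type (not iopipe's API). No slice ever leaves the pipe, so every read is resolved against the live buffer and checked against the current window length by D's normal array bounds checks.

```d
// Hypothetical sketch; `IndexedPipe` is not an iopipe type.
struct IndexedPipe
{
    private ubyte[] buf;   // GC-backed storage
    private size_t valid;  // bytes currently visible

    @property size_t length() const { return valid; }

    // Reads go through the pipe, so they always hit the live buffer and
    // are bounds-checked against the current window.
    ubyte opIndex(size_t i) const { return buf[0 .. valid][i]; }

    void opIndexAssign(ubyte value, size_t i) { buf[0 .. valid][i] = value; }

    // Grows the window by n bytes, relocating the storage every time to
    // model a realloc that moves the block.
    size_t extend(size_t n)
    {
        auto bigger = new ubyte[](valid + n);
        bigger[0 .. valid] = buf[0 .. valid];
        buf = bigger;
        valid += n;
        return n;
    }
}

unittest
{
    IndexedPipe f;
    f.extend(2);
    f[0] = 7;
    f.extend(2);          // storage moved, but no slice escaped
    assert(f[0] == 7);    // still reads the live buffer
    assert(f.length == 4);
}
```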

-Martin

Oct 19 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/19/17 7:13 AM, Martin Nowak wrote:
 On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
 What would be nice is a mechanism to detect this situation, since the
 above is both un-@safe and incorrect code.

 Possibly you could instrument a window with a mechanism to check to see
 if it's still correct on every access, to be used when compiled in
 non-release mode for checking program correctness.

 But in terms of @safe code in release mode, I think the only option is
 really to rely on the GC or reference counting to allow the window to
 still exist.

 
 We should definitely find a @nogc solution to this, but it's a good
 litmus test for the RC compiler support I'll work on.
 Why do IOPipes have to hand over the window to the caller?
 They could just implement the RandomAccessRange interface themselves.
 
 Instead of
 ```d
 auto w = f.window();
 f.extend(random());
 w[0];
 ```
 you could only do
 ```d
 f[0];
 f.extend(random());
 f[0]; // bug, but no memory corruption
 ```

So the idea here (if I understand correctly) is to encapsulate the 
window into the pipe, such that you don't need to access the buffer 
separately? I'm not quite sure because of that last comment. If f[0] is 
equivalent to the previous code's f.window[0], then the second f[0] is not a 
bug: it's valid, and accesses the first element of the window (which 
may have moved).

But let me assume that was just a misunderstanding...

 
 This problem seems to be very similar to the Range vs. Iterators
 difference, the former can perform bounds checks on indexing, the latter
 are inherently unsafe (with expensive runtime debug checks e.g. in VC++).

But ranges have this same problem.

For instance:
const(char[])[] lines = stdin.byLine.array;

Here, since byLine uses GC buffering, it's @safe (but wrong). If non-GC 
buffers are used, then it's not @safe.

I think as long as the windows are backed by GC data, it should be 
@safe. In this sense, your choice of buffering scheme can make something 
@safe or not @safe. I'm OK with that, as long as iopipes can be @safe in 
some way (and that happens to be the default).
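
The "safe but wrong" case can be reproduced without stdin. The sketch below assumes a hypothetical `ReusingLines` range that, like byLine, hands out slices of one reused GC buffer; the lines are kept the same length so the stale entries are easy to compare.

```d
import std.algorithm.iteration : map;
import std.array : array;

// Hypothetical range that, like byLine, reuses one buffer for every line.
struct ReusingLines
{
    private const(string)[] src; // lines still to be produced
    private char[] buf;          // single GC buffer shared by all fronts
    private size_t len;          // length of the current line

    this(const(string)[] lines)
    {
        src = lines;
        buf = new char[](64);    // toy limit: lines of up to 64 chars
        fill();
    }

    @property bool empty() const { return src.length == 0; }
    @property const(char)[] front() const { return buf[0 .. len]; }
    void popFront() { src = src[1 .. $]; fill(); }

    private void fill()
    {
        if (src.length == 0) return;
        len = src[0].length;
        buf[0 .. len] = src[0][];   // overwrite the previous line in place
    }
}

unittest
{
    // Memory-safe, but every stored slice aliases the same buffer,
    // which now holds the last line.
    auto wrong = ReusingLines(["ab", "cd"]).array;
    assert(wrong[0] == "cd" && wrong[1] == "cd");

    // Copying each line before storing it gives the expected result.
    auto right = ReusingLines(["ab", "cd"]).map!(l => l.idup).array;
    assert(right == ["ab", "cd"]);
}
```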

 Similarly always accessing the buffer through IOPipe would allow cheap
 bounds checking, and sure you could still offer IOPipe.ptr for unsafe code.

It's an interesting idea to simply make the iopipe the window, not just 
for @safety reasons:

1. This means the iopipe itself *is* a random access range, allowing it 
to automatically fit into existing algorithms.
2. Existing random-access ranges can be easily shoehorned into being 
iopipes (I already did it with arrays, and it's not much harder with 
popFrontN). Alternatively, code that uses iopipes can simply check for 
the existence of iopipe-like methods, and use them if they are present.
3. Less verbose usage, and more uniform access. For instance if an 
iopipe defines opIndex, then iopipe.window[0] and iopipe[0] are possibly 
different things, which would be confusing.

Some downsides however:

1. iopipes can be complex and windows are not. They were a fixed view of 
the current buffer. The idea that I can fetch a window of data from an 
iopipe and then deal simply with that part of the data was attractive.

2. The iopipe is generally not copyable once usage begins. In other 
words, the feature of ranges that you can copy them and they just work, 
would be difficult to replicate in iopipe.

A possible way forward could be:

* iopipe is a random-access range (not necessarily a forward range).
* iopipe.window returns a non-extendable window of the buffer itself, 
which is a forward/random-access range. If backed by the GC or some form 
of RC, everything is @safe.
* Functions which now take iopipes could be adjusted to take 
random-access ranges, and if they are also iopipes, could use the extend 
features to get more data.
* iopipe.release(size_t) could be hooked by popFrontN. I don't like the 
idea of supporting slicing on iopipes, for the non-forward aspect of 
iopipe. Much better to have an internal hook that modifies the range 
in-place.

This would make iopipes fit right into the range hierarchy, and 
therefore could be integrated easily into Phobos.

In fact, I can accomplish most of this by simply adding the appropriate 
range operations to iopipes. I have resisted this in the past but I 
can't see how it hurts.
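
A sketch of what layering range operations over the pipe primitives might look like, assuming a hypothetical `RangePipe` type fed from an in-memory array (the member names `extend` and `release` follow the thread; everything else is made up for illustration). Note that Phobos's `isRandomAccessRange` also requires a forward range, so the sketch only checks `isInputRange` and `hasLength`.

```d
import std.algorithm.comparison : min;
import std.range.primitives : hasLength, isInputRange;

// Hypothetical sketch; `RangePipe` is not an iopipe type.
struct RangePipe
{
    private ubyte[] source; // data not yet pulled into the window
    private ubyte[] buf;    // current window (GC-backed)

    this(ubyte[] data) { source = data; }

    // iopipe-style primitives
    size_t extend(size_t n)
    {
        immutable k = min(n, source.length);
        buf = buf ~ source[0 .. k]; // `~` always builds a new GC array
        source = source[k .. $];
        return k;
    }
    void release(size_t n) { buf = buf[n .. $]; }

    // Range primitives layered on top of the window.
    // `empty` describes the window, not the underlying source.
    @property bool empty() const { return buf.length == 0; }
    @property ubyte front() const { return buf[0]; }
    void popFront() { release(1); }
    @property size_t length() const { return buf.length; }
    ubyte opIndex(size_t i) const { return buf[i]; }

    // Member hook: advances in place, no slicing of the pipe required.
    size_t popFrontN(size_t n)
    {
        immutable k = min(n, buf.length);
        release(k);
        return k;
    }
}

static assert(isInputRange!RangePipe && hasLength!RangePipe);

unittest
{
    ubyte[] data = [1, 2, 3, 4, 5];
    auto p = RangePipe(data);
    assert(p.empty);
    assert(p.extend(3) == 3 && p[0] == 1);
    p.popFront();
    assert(p.front == 2);
    assert(p.extend(10) == 2 && p.length == 4);
    assert(p.popFrontN(3) == 3 && p[0] == 5);
}
```

The member `popFrontN` is found first when called as `p.popFrontN(n)`; generic code that calls the std.range free function by its full name would not see it.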

For Phobos inclusion, however, I don't know how to reconcile 
auto-decoding. I absolutely need to treat buffers of char, wchar, and 
dchar data as normal buffers, and not something else. This one thing may 
keep it from getting accepted.

-Steve

Oct 19 2017

Martin Nowak <code+news.digitalmars dawg.eu> writes:

On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
 On 10/19/17 7:13 AM, Martin Nowak wrote:
 On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
 What would be nice is a mechanism to detect this situation, since the
 above is both un- safe and incorrect code.

 Possibly you could instrument a window with a mechanism to check to see
 if it's still correct on every access, to be used when compiled in
 non-release mode for checking program correctness.

 But in terms of @safe code in release mode, I think the only option is
 really to rely on the GC or reference counting to allow the window to
 still exist.

 We should definitely find a @nogc solution to this, but it's a good
 litmus test for the RC compiler support I'll work on.
 Why does IOPipe have to hand over the window to the caller?
 They could just implement the RandomAccessRange interface themselves.

 Instead of
 ```d
 auto w = f.window();
 f.extend(random());
 w[0];
 ```
 you could only do
 ```d
 f[0];
 f.extend(random());
 f[0]; // bug, but no memory corruption
 ```

 
 So the idea here (If I understand correctly) is to encapsulate the
 window into the pipe, such that you don't need to access the buffer
 separately? I'm not quite sure because of that last comment. If f[0] is
 equivalent to previous code f.window[0], then the second f[0] is not a
 bug, it's valid, and accessing the first element of the window (which
 may have moved).

The above sample with the window is a bug and memory corruption because
of iterator/window invalidation by extend.
If you hadn't thought of the invalidation, then the latter example would
still be a bug to you, but not memory corruption.

 This problem seems to be very similar to the Range vs. Iterators
 difference: the former can perform bounds checks on indexing, the latter
 are inherently unsafe (with expensive runtime debug checks e.g. in VC++).

 
 But ranges have this same problem.
 
 For instance:
 const(char[])[] lines = stdin.byLine.array;
 
 Here, since byLine uses GC buffering, it's @safe (but wrong). If non-GC
 buffers are used, then it's not @safe.
 
 I think as long as the windows are backed by GC data, it should be
 @safe. In this sense, your choice of buffering scheme can make something
 @safe or not @safe. I'm OK with that, as long as iopipes can be @safe in
 some way (and that happens to be the default).
 
 Similarly always accessing the buffer through IOPipe would allow cheap
 bounds checking, and sure you could still offer IOPipe.ptr for unsafe
 code.

 
 It's an interesting idea to simply make the iopipe the window, not just
 for @safety reasons:
 
 1. this means the iopipe itself *is* a random access range, allowing it
 to automatically fit into existing algorithms.
 2. Existing random-access ranges can be easily shoehorned into being
 iopipes (I already did it with arrays, and it's not much harder with
 popFrontN). Alternatively, code that uses iopipes can simply check for
 the existence of iopipe-like methods, and use them if they are present.
 3. Less verbose usage, and more uniform access. For instance if an
 iopipe defines opIndex, then iopipe.window[0] and iopipe[0] are possibly
 different things, which would be confusing.
 
 Some downsides however:
 
 1. iopipes can be complex and windows are not. They were a fixed view of
 the current buffer. The idea that I can fetch a window of data from an
 iopipe and then deal simply with that part of the data was attractive.

You could still have a window internally and just forward to that.

 2. The iopipe is generally not copyable once usage begins. In other
 words, the feature of ranges that you can copy them and they just work,
 would be difficult to replicate in iopipe.

That's a general problem. Unique ownership is really useful, but most
phobos range methods don't care, and assume copying is implicit saving.
Not too nice and I guess this will bite us again with RC/Unique/Weak.

The current workaround for this is `refRange`.
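
A short sketch of that workaround, using `std.range.refRange` and `std.range.drop` from Phobos. The `Counter` range is a made-up stand-in for a consuming range; the point is only that a by-value range function advances its own copy unless the range is wrapped.

```d
import std.range : drop, refRange;

/// Minimal consuming input range (illustration only).
struct Counter
{
    int n;
    @property bool empty() const { return n >= 10; }
    @property int front() const { return n; }
    void popFront() { ++n; }
}

void main()
{
    auto c = Counter(0);

    // `drop` takes its argument by value: the copy advances, `c` does not.
    auto copy = c.drop(3);
    assert(copy.front == 3 && c.front == 0);

    // Wrapped in refRange, the same call consumes the original.
    auto wrapped = refRange(&c).drop(3);
    assert(wrapped.front == 3 && c.front == 3);
}
```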

 A possible way forward could be:
 
 * iopipe is a random-access range (not necessarily a forward range).
 * iopipe.window returns a non-extendable window of the buffer itself,
 which is a forward/random-access range. If backed by the GC or some form
 of RC, everything is @safe.
 * Functions which now take iopipes could be adjusted to take
 random-access ranges, and if they are also iopipes, could use the extend
 features to get more data.
 * iopipe.release(size_t) could be hooked by popFrontN. I don't like the
 idea of supporting slicing on iopipes, for the non-forward aspect of
 iopipe. Much better to have an internal hook that modifies the range
 in-place.
 
 This would make iopipes fit right into the range hierarchy, and
 therefore could be integrated easily into Phobos.

I made an interesting experiment with buffered input ranges quite a
while ago.
https://gist.github.com/MartinNowak/1257196

This would use popFront to fetch new data and ref-counts a list of
buffers depending on older saved ranges still using earlier buffers.
With a bit of creative use, the existing Range primitives could be used
to implement infinite look-ahead.

auto beg = rng.save;
auto end = rng.find("bla");
auto window = beg[0 .. end]; // get a random access window

The main problem with this has been, that the many implicit copies (e.g.
in foreach) bump the reference-count, so the RC buffer release would
often not work.
Could be avoided by making them non-copyable, but again phobos and
foreach currently don't support this hybrid of input (consuming) and
forward (saveable) range.

-Martin

Oct 21 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/21/17 6:33 AM, Martin Nowak wrote:
 On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
 On 10/19/17 7:13 AM, Martin Nowak wrote:
 On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:
 What would be nice is a mechanism to detect this situation, since the
 above is both un-@safe and incorrect code.

 Possibly you could instrument a window with a mechanism to check to see
 if it's still correct on every access, to be used when compiled in
 non-release mode for checking program correctness.

 But in terms of @safe code in release mode, I think the only option is
 really to rely on the GC or reference counting to allow the window to
 still exist.

 We should definitely find a @nogc solution to this, but it's a good
 litmus test for the RC compiler support I'll work on.
 Why does IOPipe have to hand over the window to the caller?
 They could just implement the RandomAccessRange interface themselves.

 Instead of
 ```d
 auto w = f.window();
 f.extend(random());
 w[0];
 ```
 you could only do
 ```d
 f[0];
 f.extend(random());
 f[0]; // bug, but no memory corruption
 ```

 So the idea here (If I understand correctly) is to encapsulate the
 window into the pipe, such that you don't need to access the buffer
 separately? I'm not quite sure because of that last comment. If f[0] is
 equivalent to previous code f.window[0], then the second f[0] is not a
 bug, it's valid, and accessing the first element of the window (which
 may have moved).

 
 The above sample with the window is a bug and memory corruption because
 of iterator/window invalidation by extend.
 If you hadn't thought of the invalidation, then the latter example would
 still be a bug to you, but not memory corruption.

The issue with the original code is that the window may move *within the 
buffer*. That is, if your current window is looking at the last 1k of a 
2M buffer, and you extend, the buffer manager may move the data from the 
end of the buffer to the beginning, and re-fill the rest of the buffer 
with new data from the source.

In this case, the old window reference that you saved is pointing at 
completely different data. That is, f.window[0] may not be the same as 
w[0]. Still @safe, but not correct.

Whereas in your new code, you are looking at the correct window data 
every time.
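
A rough sketch of the scenario just described, assuming a hypothetical `CompactingPipe` with a small fixed GC buffer that moves live data to the front whenever it is extended. This is not iopipe's actual buffer manager, only an illustration of why a saved window can end up looking at different data while staying memory-safe.

```d
import std.algorithm.comparison : min;
import std.string : representation;

/// Hypothetical pipe with a fixed GC buffer that compacts on extend.
struct CompactingPipe
{
    ubyte[] buffer;        // fixed-size, GC-allocated
    size_t start, end;     // window is buffer[start .. end]
    const(ubyte)[] source; // remaining input, standing in for a file

    ubyte[] window() { return buffer[start .. end]; }
    void release(size_t n) { start += n; }

    size_t extend()
    {
        // Move the live window to the front of the buffer...
        size_t live = end - start;
        foreach (i; 0 .. live)
            buffer[i] = buffer[start + i];
        start = 0;
        end = live;
        // ...and refill the remainder from the source.
        size_t n = min(buffer.length - end, source.length);
        buffer[end .. end + n] = source[0 .. n];
        source = source[n .. $];
        end += n;
        return n;
    }
}

void main()
{
    auto f = CompactingPipe(new ubyte[8], 0, 0, "abcdefghijklm".representation);
    f.extend();   // buffer holds "abcdefgh"
    f.release(5); // window is "fgh", at the end of the buffer

    auto w = f.window(); // saved slice of buffer[5 .. 8]
    assert(w[0] == 'f');

    f.extend();                   // "fgh" moves to the front, "ijklm" fills the rest
    assert(f.window()[0] == 'f'); // the pipe still starts at 'f'
    assert(w[0] == 'k');          // the stale slice now sees unrelated data
}
```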

 Some downsides however:

 1. iopipes can be complex and windows are not. They were a fixed view of
 the current buffer. The idea that I can fetch a window of data from an
 iopipe and then deal simply with that part of the data was attractive.

 
 You could still have a window internally and just forward to that.

My attention is really on algorithms that may use the range interface. 
It may be less efficient and maybe not even correct to use the whole 
iopipe as a range. At first look, I wanted to create an abstraction on 
the data itself, and then build a range on top of it. It's a different 
way to look at it.

 2. The iopipe is generally not copyable once usage begins. In other
 words, the feature of ranges that you can copy them and they just work,
 would be difficult to replicate in iopipe.

 
 That's a general problem. Unique ownership is really useful, but most
 phobos range methods don't care, and assume copying is implicit saving.
 Not too nice and I guess this will bite us again with RC/Unique/Weak.
 
 The current workaround for this is `refRange`.

There is actually quite a bit of this problem in Phobos. Most range 
wrapper functions do not take ranges by reference, but by value, making 
copies everywhere. However, most of the time, this is only during 
construction, where the copy is a move.

But many of the functions do not actually move the parameters into the 
wrapper, so disabling postblit would be horrific.

iopipe, unfortunately, follows that precedent. I should probably correct it.

 A possible way forward could be:

 * iopipe is a random-access range (not necessarily a forward range).
 * iopipe.window returns a non-extendable window of the buffer itself,
 which is a forward/random-access range. If backed by the GC or some form
 of RC, everything is @safe.
 * Functions which now take iopipes could be adjusted to take
 random-access ranges, and if they are also iopipes, could use the extend
 features to get more data.
 * iopipe.release(size_t) could be hooked by popFrontN. I don't like the
 idea of supporting slicing on iopipes, for the non-forward aspect of
 iopipe. Much better to have an internal hook that modifies the range
 in-place.

 This would make iopipes fit right into the range hierarchy, and
 therefore could be integrated easily into Phobos.

 
 I made an interesting experiment with buffered input ranges quite a
 while ago.
 https://gist.github.com/MartinNowak/1257196
 
 This would use popFront to fetch new data and ref-counts a list of
 buffers depending on older saved ranges still using earlier buffers.
 With a bit of creative use, the existing Range primitives could be used
 to implement infinite look-ahead.
 
 auto beg = rng.save;
 auto end = rng.find("bla");
 auto window = beg[0 .. end]; // get a random access window

This is similar to Dmitry's attempt as well (which unfortunately is no 
longer available that I can see), but his did not use the range 
primitives I think.

It's solving a different problem than iopipe is solving. I plan on 
adding iopipe-on-range capability soon as well, since many times, all 
you have is a range.

-Steve

Oct 23 2017

Martin Nowak <code dawg.eu> writes:

On Monday, 23 October 2017 at 16:34:19 UTC, Steven Schveighoffer 
wrote:
 On 10/21/17 6:33 AM, Martin Nowak wrote:
 On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
 On 10/19/17 7:13 AM, Martin Nowak wrote:
 On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:



 It's solving a different problem than iopipe is solving. I plan 
 on adding iopipe-on-range capability soon as well, since many 
 times, all you have is a range.

You mean chunk based processing vs. infinite lookahead for 
parsing?
They both provide a similar API, sth. to extend the current 
window and sth. to release data.
The example input here was an input range, but it's read in page 
sizes and could as well be a socket.

Oct 24 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 10/24/17 5:32 AM, Martin Nowak wrote:
 On Monday, 23 October 2017 at 16:34:19 UTC, Steven Schveighoffer wrote:
 On 10/21/17 6:33 AM, Martin Nowak wrote:
 On 10/19/2017 03:12 PM, Steven Schveighoffer wrote:
 On 10/19/17 7:13 AM, Martin Nowak wrote:
 On 10/13/2017 08:39 PM, Steven Schveighoffer wrote:



 It's solving a different problem than iopipe is solving. I plan on 
 adding iopipe-on-range capability soon as well, since many times, all 
 you have is a range.

 
 You mean chunk based processing vs. infinite lookahead for parsing?
 They both provide a similar API, sth. to extend the current window and 
 sth. to release data.

Yes, definitely.

 The example input here was an input range, but it's read in page sizes 
 and could as well be a socket.

iopipe provides "infinite" lookahead, which is central to its purpose. 
The trouble with bolting that on top of ranges, as you said, is that we 
have to copy everything out of the range, which necessarily buffers 
somehow (if it's efficient i/o), so you are double buffering. iopipe's 
purpose is to get rid of this unnecessary buffering. This is why it's a 
great fit for being the *base* of a range.

In other words, if you want something to have optional lookahead and 
range support, it's better to start out with an extendable buffering 
type like an iopipe, and bolt ranges on top, vs. the other way around.
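
As a sketch of "ranges on top", the following layers a line-by-line input range over any type offering `window`, `extend` and `release`. `StringPipe`, `Lines`, and the chunk size of 4 are all assumptions made for the example; the point is that `front` is a slice of the pipe's own window, so no second buffer is involved.

```d
import std.algorithm.comparison : min;
import std.array : array;

/// Hypothetical in-memory pipe; `data` stands in for a file or socket.
struct StringPipe
{
    string data;
    size_t start, end; // window is data[start .. end]

    string window() { return data[start .. end]; }
    size_t extend(size_t n)
    {
        immutable got = min(n, data.length - end);
        end += got;
        return got; // 0 signals end of input
    }
    void release(size_t n) { start += n; }
}

/// Input range of lines layered on any type with window/extend/release.
struct Lines(Pipe)
{
    private Pipe* pipe;
    private size_t lineLen; // includes the '\n'; 0 once input is exhausted

    this(Pipe* p) { pipe = p; scan(); }

    @property bool empty() { return lineLen == 0; }
    @property auto front() { return pipe.window()[0 .. lineLen]; }
    void popFront() { pipe.release(lineLen); scan(); }

    // Extend the window until it holds a full line or input ends.
    private void scan()
    {
        size_t checked = 0;
        for (;;)
        {
            auto w = pipe.window();
            foreach (i; checked .. w.length)
                if (w[i] == '\n') { lineLen = i + 1; return; }
            checked = w.length;
            if (pipe.extend(4) == 0) { lineLen = w.length; return; }
        }
    }
}

void main()
{
    auto pipe = StringPipe("ab\ncd\ne");
    auto lines = Lines!StringPipe(&pipe);
    assert(lines.array == ["ab\n", "cd\n", "e"]);
}
```

Collecting the lines with `array` is only valid here because the backing data is an immutable string; with a reused buffer, each `front` would be invalidated by the next `popFront`, much like `byLine`.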

-Steve

Oct 24 2017

Martin Nowak <code dawg.eu> writes:

On Tuesday, 24 October 2017 at 14:47:02 UTC, Steven Schveighoffer 
wrote:
 iopipe provides "infinite" lookahead, which is central to its 
 purpose. The trouble with bolting that on top of ranges, as you 
 said, is that we have to copy everything out of the range, 
 which necessarily buffers somehow (if it's efficient i/o), so 
 you are double buffering. iopipe's purpose is to get rid of 
 this unnecessary buffering. This is why it's a great fit for 
 being the *base* of a range.

 In other words, if you want something to have optional 
 lookahead and range support, it's better to start out with an 
 extendable buffering type like an iopipe, and bolt ranges on 
 top, vs. the other way around.

Arguably it is somewhat hacky to use a range as end marker 
for slicing sth., but you'd get the same benefit, access to the 
random buffer with zero-copying.

auto beg = rng.save; // save current position
auto end = rng.find("bla"); // lookahead using popFront
auto window = beg[0 .. end]; // get a random access window to underlying buffer

So basically forward ranges with slicing.
That would at least require extending all algorithms with 
`extend` support, though likely you could have a small extender 
proxy range for IOPipes.

Note that rng could be a wrapper around unbuffered IO reads.

Oct 24 2017

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On Tuesday, 24 October 2017 at 19:05:02 UTC, Martin Nowak wrote:
 On Tuesday, 24 October 2017 at 14:47:02 UTC, Steven 
 Schveighoffer wrote:
 iopipe provides "infinite" lookahead, which is central to its 
 purpose. The trouble with bolting that on top of ranges, as 
 you said, is that we have to copy everything out of the range, 
 which necessarily buffers somehow (if it's efficient i/o), so 
 you are double buffering. iopipe's purpose is to get rid of 
 this unnecessary buffering. This is why it's a great fit for 
 being the *base* of a range.

 In other words, if you want something to have optional 
 lookahead and range support, it's better to start out with an 
 extendable buffering type like an iopipe, and bolt ranges on 
 top, vs. the other way around.

 Arguably it is somewhat hacky to use a range as end marker 
 for slicing sth., but you'd get the same benefit, access to the 
 random buffer with zero-copying.

 auto beg = rng.save; // save current position
 auto end = rng.find("bla"); // lookahead using popFront
 auto window = beg[0 .. end]; // get a random access window to underlying buffer


I had a design like that except save returned a “mark” (not full 
range) and there was a slice primitive. It even worked with 
patched std.regex, but at a non-zero performance penalty.

I think that maintaining the illusion of a full copy of range 
when you do “save” for buffered I/O stream is too costly. Because 
a user can now legally advance both - you need to RC buffers 
behind the scenes with separate “pointers” for each range that 
effectively pin them.
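
For comparison, a very small sketch of the "mark plus slice primitive" shape. `MarkedStream`, `Mark`, `mark` and `slice` are names invented here, not the API of the earlier design. Because the data is an in-memory string, nothing needs pinning; a real buffered stream would have to keep everything from the earliest outstanding mark alive.

```d
/// Hypothetical stream: `save` is replaced by a lightweight mark.
struct MarkedStream
{
    static struct Mark { size_t offset; }

    private string data; // stands in for the buffered input
    private size_t pos;

    @property bool empty() { return pos == data.length; }
    @property char front() { return data[pos]; }
    void popFront() { ++pos; }

    /// Remember the current position without copying the stream.
    Mark mark() { return Mark(pos); }

    /// Zero-copy view of everything between two marks.
    string slice(Mark from, Mark to) { return data[from.offset .. to.offset]; }
}

void main()
{
    auto s = MarkedStream("key=value;rest");
    auto begin = s.mark();
    while (!s.empty && s.front != ';')
        s.popFront(); // look ahead by consuming the one stream
    assert(s.slice(begin, s.mark()) == "key=value");
}
```

Only one cursor ever advances, so there is no second range whose position must be tracked; the mark is just an offset.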

 So basically forward ranges with slicing.
 That would at least require extending all algorithms with 
 `extend` support, though likely you could have a small extender 
 proxy range for IOPipes.

 Note that rng could be a wrapper around unbuffered IO reads.

Oct 24 2017
