digitalmars.D - Scope Containers

bitwise <bitwise.pvt gmail.com> writes:
Hi,

Does anybody know if there are any container libraries that make 
use of scope for iterator/range safety? or is the implementation 
of scope even ready for that right now?

Thanks
Mar 09 2019
Eugene Wissner <belka caraus.de> writes:
On Saturday, 9 March 2019 at 16:14:34 UTC, bitwise wrote:
 Hi,

 Does anybody know if there are any container libraries that 
 make use of scope for iterator/range safety? or is the 
 implementation of scope even ready for that right now?

 Thanks
tanya's containers are good candidates for scope/return, since they don't rely on reference counting and GC like Phobos containers. I intend to introduce scope/return ranges soon, because currently using the containers is not very safe. But honestly I'm not expecting much from these annotations. Last time I tested "return" with dip25 it could catch only some basic cases, but not more complicated cases where I assign ranges to variables and then use them.
Mar 09 2019
bitwise <bitwise.pvt gmail.com> writes:
On Saturday, 9 March 2019 at 16:34:18 UTC, Eugene Wissner wrote:
 [...]
Ok thanks, I'll have a look. All I really need is scoped return values so ranges/iterators and such can't leave the scope of their containers. So even a partial implementation would suffice if it was reliable.
Mar 09 2019
Atila Neves <atila.neves gmail.com> writes:
On Saturday, 9 March 2019 at 16:14:34 UTC, bitwise wrote:
 Hi,

 Does anybody know if there are any container libraries that 
 make use of scope for iterator/range safety? or is the 
 implementation of scope even ready for that right now?

 Thanks
I wrote this: https://github.com/atilaneves/automem/blob/master/source/automem/vector.d
Mar 10 2019
bitwise <bitwise.pvt gmail.com> writes:
On Sunday, 10 March 2019 at 12:06:10 UTC, Atila Neves wrote:
 I wrote this:

 https://github.com/atilaneves/automem/blob/master/source/automem/vector.d
I'm confused by the use of scope. For example, what effect does scope have here?

    this(this) scope { ... }

I don't see any ranges/iterators in that class either, which is what's at issue for me.

At present, it seems impossible to create a memory-safe container that supports iterators or ranges in D, unless it's entirely GC.

The best attempt I've seen is std.Array, which uses a ref-counted/pointer-to-pointer payload to make sure any ranges that were given out can see the changes to the payload. Isn't it specified that a struct should be moveable in D though? So as soon as you memcpy an std.Array everything starts exploding/double-deleting. Also, this seems like it would apply to *any* library based ref-counted implementation. So it seems like scope is the last hope for creating a @safe @nogc container that can give out ranges/iterators. Even adding a move-constructor to D seems like a no-go since it's expected that structs can be memcpy'ed.

Am I wrong about this?

Thanks
Mar 10 2019
Atila Neves <atila.neves gmail.com> writes:
On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
 On Sunday, 10 March 2019 at 12:06:10 UTC, Atila Neves wrote:
 I wrote this:

 https://github.com/atilaneves/automem/blob/master/source/automem/vector.d
I'm confused by the use of scope. For example, what effect does scope have here? this(this) scope { ... }
Like `const`, `scope` on a member function means that the `this` reference is `scope`.
 I don't see any ranges/iterators in that class either,
https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L134
https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L159
https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L145

    static assert(isInputRange!(Vector!int));
 At present, it seems impossible to create a memory-safe 
 container that supports iterators or ranges in D, unless it's 
 entirely GC.

 The best attempt I've seen is std.Array, which uses a 
 ref-counted/pointer-to-pointer payload to make sure any ranges 
 that were given out can see the changes to the payload. Isn't 
 it specified that a struct should be moveable in D though? So 
 as soon as you memcpy an std.Array everything starts 
 exploding/double-deleting. Also, this seems like it would apply 
 to *any* library based ref-counted implementation. So it seems 
 like scope is the last hope for creating a @safe @nogc
 container that can give out ranges/iterators. Even adding a 
 move-constructor to D seems like a no-go since it's expected 
 that structs can be memcpy'ed.

 Am I wrong about this?
I would think so, especially since I wrote Vector to be safe with dip1000. Look at this unit test for not being able to escape the payload for example: https://github.com/atilaneves/automem/blob/master/tests/ut/vector.d#L229
Mar 10 2019
bitwise <bitwise.pvt gmail.com> writes:
On Sunday, 10 March 2019 at 16:10:10 UTC, Atila Neves wrote:
 On Sunday, 10 March 2019 at 15:12:03 UTC, bitwise wrote:
 On Sunday, 10 March 2019 at 12:06:10 UTC, Atila Neves wrote:
 [...]
I'm confused by the use of scope. For example, what effect does scope have here? this(this) scope { ... }
Like `const`, `scope` on a member function means that the `this` reference is `scope`.
Interesting. I'll have to look up some examples.
 [...]
I would think so, especially since I wrote Vector to be safe with dip1000. Look at this unit test for not being able to escape the payload for example: https://github.com/atilaneves/automem/blob/master/tests/ut/vector.d#L229
Ok, but this container copies the entire collection in postblit:
https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L104

So if you try to treat it like a range, you'll take a performance hit every time you trigger the postblit, which seems likely. Isn't it reasonable to assume a range should act like a reference to data, rather than a copy of it?
Mar 10 2019
Alex <sascha.orlov gmail.com> writes:
On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
 Isn't it reasonable to assume a range should act like a 
 reference to data, rather than a copy of it?
I think this depends heavily on the data and the container. What if the data is stored inside the container? How can the container reference data if it is the one and only holder? In which format do you expect the container to release the raw data? How do you propose to synchronize data movement with references to it, if the container contains only references? And what if the data may not be copied or moved, but the container manages the data by copy? Is the data assumed to be destroyed/invalidated after the container is destroyed? And why?
Mar 10 2019
Atila Neves <atila.neves gmail.com> writes:
On Sunday, 10 March 2019 at 17:36:09 UTC, bitwise wrote:
 [...]
Ok, but this container copies the entire collection in postblit: https://github.com/atilaneves/automem/blob/2a97acba94d6fe0bf9ba07fec99e86e46aa0f2a1/source/automem/vector.d#L104
Well, yes. As it says in the module ddoc, it's my version of C++'s std::vector. If you don't want it to copy, pass by ref.
 So if you try to treat it like a range, you'll take a 
 performance hit every time you trigger the postblit, which 
 seems likely.
You can pass it by ref like I mention above, or you could slice it and pass the slice instead. Without DIP1000 passing the slice wouldn't be safe but with it, it is. Again, just like std::vector, which is pretty much always passed by const ref. But safer and with a default allocator.
 Isn't it reasonable to assume a range should act like a 
 reference to data, rather than a copy of it?
It depends.
Mar 10 2019
bitwise <bitwise.pvt gmail.com> writes:
On Sunday, 10 March 2019 at 18:46:14 UTC, Atila Neves wrote:
 [...]
This container seems good for short-lived usage. If I wanted to pass a bunch of filenames to a function for processing, I would consider this container Ok, because it would allocate once, then be consumed as a range without having to copy anything. For any kind of persistent storage though, I would consider the performance costs of using this container's range functionality too high.

At this point, I'm leaning toward just building separate containers. One would be a vector-like container that's @nogc and only has an indexer, but does not support iterators/ranges. The second would be a class-based container which *would* have range/iterator support since the ranges/iterators could safely contain pointers back to the GC-allocated collection for range checks and such.

I do think the idea of making the container itself the range is interesting, but what ruins it for me is the fact that ranges are consumable. So I'm wondering if it would work well to use an integer index for the range implementation instead of actually depleting the elements as you iterate, and have a "reset" function to refill the range so that no copying is necessary. I wonder how useful it would be to have some sort of NonConsumableRange concept that included a reset() function.

Bit
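A minimal sketch of the index-based idea above, under stated assumptions: `IndexedRange` and `reset` are hypothetical names (not part of automem or Phobos), and the storage is a plain GC slice for brevity. `popFront` only advances an integer, so nothing is removed from the underlying data and `reset` rewinds without copying:

```d
import std.range.primitives : isInputRange;

// Hypothetical helper for illustration; not from automem or Phobos.
struct IndexedRange(T)
{
    private T[] data;   // view of the elements; never shrunk
    private size_t pos; // current position within `data`

    @property bool empty() const { return pos >= data.length; }
    @property ref inout(T) front() inout { return data[pos]; }
    void popFront() { ++pos; }

    // Rewind to the first element without touching the data.
    void reset() { pos = 0; }
}

static assert(isInputRange!(IndexedRange!int));

// Run with: dmd -unittest -main
unittest
{
    auto r = IndexedRange!int([1, 2, 3]);

    int sum;
    for (; !r.empty; r.popFront())
        sum += r.front;
    assert(sum == 6 && r.empty);

    r.reset(); // the elements are still there
    assert(!r.empty && r.front == 1);
}
```

Note that `foreach (x; r)` iterates over a copy of the struct, so the original's position is left unchanged; a by-value range over a slice already behaves as "non-consumable" in that sense.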
Mar 11 2019
Atila Neves <atila.neves gmail.com> writes:
On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
 [...]
This container seems good for short-lived usage.
Why?
 If I wanted to pass a bunch of filenames to a function for 
 processing, I would consider this container Ok, because it 
 would allocate once, then be consumed as a range without having 
 to copy anything.
Or you don't consume it all and slice it (as I mentioned in my last post):

    @safe unittest {
        scope v = vector(1, 2, 3);
        static void fun(R)(R range) {
            import std.array: array;
            assert(range.array == [1, 2, 3]);
        }
        fun(v[]);
        // wasn't consumed
        assert(v[] == [1, 2, 3]);
    }

Again, it's safe because the slice is scoped.
 For any kind of persistent storage though, I would consider the 
 performance costs of using this container's range functionality 
 too high.
There aren't any performance costs. You pass a fat pointer just like you would with a GC array.
 I do think the idea of making the container itself the range is 
 interesting, but what ruins it for me is the fact that ranges 
 are consumable.
Which is why Vector has both opSlice and copies by default. It's only a range itself if the element type is mutable, otherwise it doesn't even define popFront.
 So I'm wondering if it would work well to use an integer index 
 for the range implementation
That's what slices are.
Mar 11 2019
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 11, 2019 at 06:29:23PM +0000, Atila Neves via Digitalmars-d wrote:
 On Monday, 11 March 2019 at 17:00:35 UTC, bitwise wrote:
[...]
 I do think the idea of making the container itself the range is
 interesting, but what ruins it for me is the fact that ranges are
 consumable.
Which is why Vector has both opSlice and copies by default. It's only a range itself if the element type is mutable, otherwise it doesn't even define popFront.
[...]

Generally, it's a bad idea to conflate a container with a range over the container, precisely for this reason. In this respect, built-in arrays are a bad example, because we think of them as containers, yet they are at the same time ranges. (Though strictly speaking, "arrays" are just slices of the underlying containers that are managed by druntime and not really directly manipulatable. But nobody thinks of them in this way; the concept of "array" is much easier to grasp and reason with than "slice over druntime-managed container".)

The recommended approach is to have the container be distinct from a range over its elements, and provide such a range via a member function, commonly chosen to be opSlice.

T

--
Real men don't take backups. They put their source on a public FTP-server and let the world mirror it. -- Linus Torvalds
Mar 11 2019
Atila Neves <atila.neves gmail.com> writes:
On Monday, 11 March 2019 at 19:23:19 UTC, H. S. Teoh wrote:
 On Mon, Mar 11, 2019 at 06:29:23PM +0000, Atila Neves via 
 Digitalmars-d wrote:
 [...]
[...]
 [...]
[...] Generally, it's a bad idea to conflate a container with a range over the container, precisely for this reason. In this respect, built-in arrays are a bad example, because we think of them as containers, yet they are at the same time ranges. (Though strictly speaking, "arrays" are just slices of the underlying containers that are managed by druntime and not really directly manipulatable. But nobody thinks of them in this way; the concept of "array" is much easier to grasp and reason with than "slice over druntime-managed container".) [...]
Maybe so. I'll have to think about it and maybe change the API so it's no longer a range.
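For readers following along, the separation discussed above (a container that is not itself a range, with `opSlice` handing out a distinct range type) might look roughly like this. It is only a sketch: `Bag` and its nested `Range` are hypothetical names, the storage is a GC slice, and none of this is automem's actual API.

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isForwardRange, isInputRange;

// Hypothetical container used only for illustration.
struct Bag(T)
{
    private T[] items;

    void insert(T value) { items ~= value; }
    @property size_t length() const { return items.length; }

    // The range is a separate type; consuming it leaves the container intact.
    static struct Range
    {
        private T[] view;

        @property bool empty() const { return view.length == 0; }
        @property ref inout(T) front() inout { return view[0]; }
        void popFront() { view = view[1 .. $]; }
        @property Range save() { return this; }
    }

    // `bag[]` hands out a fresh range over the current elements.
    Range opSlice() { return Range(items); }
}

alias IntBag = Bag!int;
static assert(!isInputRange!IntBag);          // the container is not a range
static assert(isForwardRange!(IntBag.Range)); // the range it hands out is

// Run with: dmd -unittest -main
unittest
{
    IntBag bag;
    bag.insert(1);
    bag.insert(2);
    bag.insert(3);

    auto r = bag[];
    r.popFront();            // consumes the range only
    assert(r.equal([2, 3]));
    assert(bag.length == 3); // container untouched
    assert(bag[].equal([1, 2, 3]));
}
```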
Mar 11 2019
bitwise <bitwise.pvt gmail.com> writes:
On Monday, 11 March 2019 at 18:29:23 UTC, Atila Neves wrote:
 [...]
At first, I missed the use of scope here:
 auto opSlice(this This)() scope return
My reasoning was that using automem/vector for persistent storage required it to be iterated over, which couldn't be done without consuming the container, or copying it.

I see now that you can iterate by returning a scoped slice of the internal data store, but this could be unsafe too. If you called reserve on the container, the returned slice could end up dangling if a non-gc allocator was used, even if the returned slice was scoped.

I think that if opSlice returned a scoped range with a pointer back to the original container, then it would be easy for the range object to detect the state of the container, even including if it had been moved from and left in its initial state. As long as the range couldn't leave the stack-frame of the container, I think this would be totally safe.
Mar 11 2019
Atila Neves <atila.neves gmail.com> writes:
On Monday, 11 March 2019 at 19:40:06 UTC, bitwise wrote:
<snip>
 I see now that you can iterate by returning a scoped slice of 
 the internal data store, but this could be unsafe too. If you 
 called reserve on the container, the returned slice could end 
 up dangling if a non-gc allocator was used, even if the 
 returned slice was scoped.
Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
 I think that if opSlice returned a scoped range with a pointer 
 back to the original container,
It does.
 As long as the range couldn't leave the stack-frame of the 
 container, I think this would be totally safe.
It can't; that's the whole point of dip1000. Again, that's why this test passes: https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/tests/ut/vector.d#L229
Mar 11 2019
bitwise <bitwise.pvt gmail.com> writes:
On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
<snip>
 I see now that you can iterate by returning a scoped slice of 
 the internal data store, but this could be unsafe too. If you 
 called reserve on the container, the returned slice could end 
 up dangling if a non-gc allocator was used, even if the 
 returned slice was scoped.
Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
So if I use opSlice() to retrieve a slice of the internal array, then call reserve on the vector causing it to deallocate, then try to access the slice I just retrieved, what would you call that?
 I think that if opSlice returned a scoped range with a pointer 
 back to the original container,
It does.
https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/source/automem/vector.d#L284

I mean a custom range object that contains a pointer back to the original container. This code returns a pointer directly to memory inside the container. It's not wrapped in a range object, and it doesn't have any pointer back to the vector, which it would need because the internal memory of the vector could be deallocated, but the actual vector object itself could not (relative to the scoped range). At most, the container could be moved from and considered empty by the range it returned.
 As long as the range couldn't leave the stack-frame of the 
 container, I think this would be totally safe.
It can't; that's the whole point of dip1000. Again, that's why this test passes: https://github.com/atilaneves/automem/blob/e34b2b0c1510efd91064a9cbf83cbf43856c1a5c/tests/ut/vector.d#L229
I was referring to my suggested implementation of a range, not the slice returned by automem/vector. automem/vector's opSlice() could potentially allow you to dereference an invalid pointer. I don't know if detecting that is within the scope of dip1000, but I would still consider it a hole in the implementation.
Mar 11 2019
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
 <snip>
 I see now that you can iterate by returning a scoped slice of 
 the internal data store, but this could be unsafe too. If you 
 called reserve on the container, the returned slice could end 
 up dangling if a non-gc allocator was used, even if the 
 returned slice was scoped.
Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
I think that this is what bitwise was talking about: https://github.com/atilaneves/automem/issues/25
Mar 12 2019
bitwise <bitwise.pvt gmail.com> writes:
On Tuesday, 12 March 2019 at 18:06:39 UTC, Petar Kirov 
[ZombineDev] wrote:
 On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
 <snip>
 I see now that you can iterate by returning a scoped slice of 
 the internal data store, but this could be unsafe too. If you 
 called reserve on the container, the returned slice could end 
 up dangling if a non-gc allocator was used, even if the 
 returned slice was scoped.
Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
I think that this is what bitwise was talking about: https://github.com/atilaneves/automem/issues/25
I hadn't seen this, but yes, thanks. Bit
Mar 12 2019
Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 12 March 2019 at 21:06:17 UTC, bitwise wrote:
 On Tuesday, 12 March 2019 at 18:06:39 UTC, Petar Kirov 
 [ZombineDev] wrote:
 On Monday, 11 March 2019 at 20:02:05 UTC, Atila Neves wrote:
 <snip>
 I see now that you can iterate by returning a scoped slice 
 of the internal data store, but this could be unsafe too. If 
 you called reserve on the container, the returned slice 
 could end up dangling if a non-gc allocator was used, even 
 if the returned slice was scoped.
Unless there's a bug in my implementation or in dip1000, no, it can't be unsafe.
I think that this is what bitwise was talking about: https://github.com/atilaneves/automem/issues/25
I hadn't seen this, but yes, thanks. Bit
After thinking about what bitwise said, I was actually going to post back similar code.

The code in the github issue is problematic, but it's a far cry from how such errors are usually introduced in C++. Normally it's because another thread has a reference to the now dangling pointer, or calling a function with the slice means it gets escaped somewhere. Another way would be to write this:

    void oops(ref Vector!int vec, int[] slice);

Which is also convoluted. Neither case would pass code review.

Either way, thanks for pointing out the possible issue. I'm going to have to think long and hard about whether it's possible to fix it with the language we have now.
Mar 13 2019
Olivier FAURE <couteaubleu gmail.com> writes:
On Wednesday, 13 March 2019 at 08:49:37 UTC, Atila Neves wrote:
 Either way, thanks for pointing out the possible issue. I'm 
 going to have to think long and hard about whether it's 
 possible to fix it with the language we have now.
One possibility might be to implement at run time what Rust does at compile time, e.g. make it impossible to mutate the vector as long as one or more readable views of its content exist.

So for instance, make opSlice and opIndex return a struct that bumps a reference counter up inside the vector; the counter is bumped down once that struct is destructed. As long as the number of "live" slices is non-zero, the vector isn't allowed to re-allocate.
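A rough sketch of that run-time counting scheme, with loud assumptions: `CountedVector`, `View`, `view` and `append` are hypothetical names, the storage is a GC array for brevity (so nothing here can actually dangle), and the code is plain @system D rather than anything checked by dip1000. It only illustrates the bookkeeping.

```d
import std.exception : assertThrown, enforce;

// Hypothetical types for illustration; not automem's API.
struct CountedVector
{
    private int[] data;       // GC-backed here for brevity
    private size_t liveViews; // number of views currently alive

    // Hand out a view that registers itself with this vector.
    View view() { return View(&this); }

    // Growing may reallocate, so refuse while any view is alive.
    void append(int value)
    {
        enforce(liveViews == 0, "vector is borrowed; cannot reallocate");
        data ~= value;
    }

    @property size_t length() const { return data.length; }
}

struct View
{
    private CountedVector* owner;

    this(CountedVector* owner) { this.owner = owner; ++owner.liveViews; }
    this(this) { if (owner) ++owner.liveViews; } // a copy is another borrow
    ~this() { if (owner) --owner.liveViews; }    // release on destruction

    int opIndex(size_t i) const { return owner.data[i]; }
    @property size_t length() const { return owner.data.length; }
}

// Run with: dmd -unittest -main
unittest
{
    CountedVector v;
    v.append(1);
    {
        auto r = v.view();
        assert(r[0] == 1);
        assertThrown(v.append(2)); // blocked while `r` is alive
    }
    v.append(2);                   // allowed again once the view is gone
    assert(v.length == 2);
}
```

One caveat of this sketch: the view stores a raw pointer to the vector, so moving the vector while views exist would still invalidate them, which is the same struct-move concern raised earlier in the thread.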
 The code in the github issue is problematic, but it's a far cry 
 from how such errors are usually introduced in C++. Normally 
 it's because another thread has a reference to the now dangling 
 pointer, or calling a function with the slice means it gets 
 escaped somewhere.
 Which is also convoluted. Neither case would pass code review.
I don't think that's the point. It's supposed to be *impossible* to get a memory corruption in @safe code. Not "convoluted and wouldn't pass code review", impossible. When you write a @trusted function, getting a memory corruption with that function is supposed to be impossible as well.

Other people have gone over why memory safety is necessary before. Not all code goes through code review, sometimes reviews miss errors even for critical applications, etc.
Mar 13 2019
Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 13 March 2019 at 11:35:02 UTC, Olivier FAURE wrote:
 On Wednesday, 13 March 2019 at 08:49:37 UTC, Atila Neves wrote:
 [...]
One possibility might be to implement at run time what rust does at compile time, eg make it impossible to mutate vector as long as one or more readable view to its content exists. So for instance, make opSlice and opIndex return a struct that bumps a reference counter up inside the vector; the counter is bumped down once that struct is destructed. As long as the number of "live" slices is non-zero, the vector isn't allowed to re-allocate.
 [...]
 [...]
I don't think that's the point. It's supposed to be *impossible* to get memory corruption in @safe code. Not "convoluted and wouldn't pass code review", impossible. When you write a @trusted function, getting memory corruption with that function is supposed to be impossible as well. Other people have gone over why memory safety is necessary before. Not all code goes through code review, sometimes reviews miss errors even for critical applications, etc.
I just tagged v0.6.0.

I made `opSlice` `@system` given what has been discussed here.

I introduced a new member function `range` which returns a forward range that keeps a pointer to the vector inside of it. Coupled with dip1000 it seems to do the trick.

I also made it so `Vector` is no longer an input range itself.
Mar 13 2019
next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Wednesday, 13 March 2019 at 13:29:51 UTC, Atila Neves wrote:
 [...]

 I introduced a new member function `range` which returns a 
 forward range that keeps a pointer to the vector inside of it. 
 Coupled with dip1000 it seems to do the trick.
Awesome.

I'm trying to test the same thing, but -dip1000 seems to have no effect, and -scope is unrecognized. Do I have to download the beta or git HEAD to get a functional DIP1000 implementation?

Thanks again,
  Bit
Mar 15 2019
parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 15 March 2019 at 16:11:43 UTC, bitwise wrote:
 On Wednesday, 13 March 2019 at 13:29:51 UTC, Atila Neves wrote:
 [...]

 I introduced a new member function `range` which returns a 
 forward range that keeps a pointer to the vector inside of it. 
 Coupled with dip1000 it seems to do the trick.
Awesome. I'm trying to test the same thing, but -dip1000 seems to have no effect, and -scope is unrecognized. Do I have to download the beta or git HEAD to get a functional DIP1000 implementation? Thanks again, Bit
It definitely has an effect. This code compiles with -dip1000 and doesn't without it:

    void main() @safe
    {
        int i;
        fun(&i);
    }

    void fun(scope int* i) @safe { }

Works as intended on dmd 2.085.0 and 2.084.1.
Mar 18 2019
prev sibling parent reply Olivier FAURE <couteaubleu gmail.com> writes:
On Wednesday, 13 March 2019 at 13:29:51 UTC, Atila Neves wrote:
 I made `opSlice` `@system` given what has been discussed here.

 I introduced a new member function `range` which returns a 
 forward range that keeps a pointer to the vector inside of it. 
 Coupled with dip1000 it seems to do the trick.
Have you tested it with opIndex? You'd probably get the same problem, since it returns a reference to data that might be reallocated later.
Mar 16 2019
parent Atila Neves <atila.neves gmail.com> writes:
On Saturday, 16 March 2019 at 12:17:20 UTC, Olivier FAURE wrote:
 On Wednesday, 13 March 2019 at 13:29:51 UTC, Atila Neves wrote:
 I made `opSlice` `@system` given what has been discussed here.

 I introduced a new member function `range` which returns a 
 forward range that keeps a pointer to the vector inside of it. 
 Coupled with dip1000 it seems to do the trick.
Have you tested it with opIndex? You'd probably get the same problem, since it returns a reference to data that might be reallocated later.
Ugh, you're right. Not sure what to do about that one, but thanks for bringing it up!
Mar 18 2019
prev sibling parent bitwise <bitwise.pvt gmail.com> writes:
On Wednesday, 13 March 2019 at 11:35:02 UTC, Olivier FAURE wrote:
 One possibility might be to implement at run time what rust 
 does at compile time, eg make it impossible to mutate vector as 
 long as one or more readable view to its content exists.

 So for instance, make opSlice and opIndex return a struct that 
 bumps a reference counter up inside the vector; the counter is 
 bumped down once that struct is destructed. As long as the 
 number of "live" slices is non-zero, the vector isn't allowed 
 to re-allocate.
I'm not 100% sure, but it kind of seems like this solution requires scope, yet is also obviated by it.
 It's supposed to be *impossible* to get a memory corruption in 
 @safe code. Not "convoluted and wouldn't pass code review", 
 impossible. When you write a @trusted function, getting a 
 memory corruption with that function is supposed to be 
 impossible as well.
I agree that safety should be a strict guarantee.
Mar 15 2019
prev sibling parent bitwise <bitwise.pvt gmail.com> writes:
On Wednesday, 13 March 2019 at 08:49:37 UTC, Atila Neves wrote:
 On Tuesday, 12 March 2019 at 21:06:17 UTC, bitwise wrote:
 [...]
    Bit
After thinking about what bitwise said, I was actually going to post back similar code. The code in the github issue is problematic, but it's a far cry from how such errors are usually introduced in C++. Normally it's because another thread has a reference to the now dangling pointer, or calling a function with the slice means it gets escaped somewhere.
I think that in C++, iterator validity is a hard concept to overlook. I would bet that even relatively new C++ programmers are familiar with the idea. I think most intro C++ classes teach this type of thing relatively early on.

This awareness prevents/mitigates a lot of potential problems, resulting in the issue being underrepresented in people's negative views of C++ (or so it would seem, I've seen no actual data on this).

In any case, it would be great if callers could just ignore the issue completely.
 void oops(ref Vector!int vec, int[] slice);
Yeah, I remember Walter's post about how this will mess with ref-counted objects as well. IIRC, the solution was compiler support for double-incrementing the ref-count when it finds this case.

I wonder if this is actually necessary though, because in C++, shared_ptr has the aliasing constructor, and with D's templates, something like RefCounted could be modified to expose the members of its contained type as properties that work like shared_ptr's aliasing constructor, wrapping the returned field value in a ref-counted object that shares the ref-count of its parent object. If this was done, the member exposing that slice would do the extra increment without compiler support.
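A rough sketch of what such an aliasing handle could look like, under heavy assumptions. `Shared`, `make`, `field` and `useCount` are hypothetical names, not Phobos' `RefCounted`. The payload lives on the GC heap, so the sketch only shows the count being shared and leaves out destroying the owner when the count reaches zero.

```d
// Hypothetical sketch; none of these names come from Phobos.
struct Shared(T)
{
    private size_t* count; // control block, shared with every alias
    private T* payload;    // what this particular handle points at

    this(this) { if (count) ++*count; }

    ~this()
    {
        // A real implementation would destroy and free the owner at zero.
        if (count) --*count;
    }

    ref T get() { return *payload; }
    size_t useCount() const { return count ? *count : 0; }

    // Counterpart of shared_ptr's aliasing constructor: the result points at
    // a member of the payload but shares (and increments) the owner's count.
    Shared!U field(U)(U* member)
    {
        Shared!U result;
        result.count = count;
        result.payload = member;
        if (count) ++*count;
        return result;
    }
}

// Hypothetical factory: allocates the payload and a fresh count of one.
Shared!T make(T)(T value)
{
    Shared!T s;
    s.count = new size_t;
    *s.count = 1;
    s.payload = new T;
    *s.payload = value;
    return s;
}

struct Pair { int a; int b; }

unittest
{
    auto p = make(Pair(1, 2));
    assert(p.useCount == 1);
    {
        auto b = p.field(&p.get().b); // alias to a member, same count
        assert(p.useCount == 2);
        assert(b.get() == 2);
    }
    assert(p.useCount == 1); // the alias released its share
}
```

In this shape the extra increment happens inside `field`, so a caller holding only the member handle still keeps the parent alive, which is the effect the compiler-assisted double increment was meant to give.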
 Either way, thanks for pointing out the possible issue. I'm 
 going to have to think long and hard about whether it's 
 possible to fix it with the language we have now.
Thanks for exploring this issue with me. I agree it's tough to build a proper container right now, but I do believe I have more options than when I started.

I just found this as well, which I'm pretty excited about:

https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1014.md

The version of std::vector from Visual Studio makes use of this for iterator validation in debug mode. The vector heap-allocates a "proxy" object and stores its own pointer in it. When the vector moves (and its move ctor is called), it updates the pointer in the proxy object. It also gives this proxy out to iterators so they can track the original vector.

Without any type of move callback in D, such an implementation would have been impossible, so this is awesome news.
Mar 15 2019