www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - container stuff

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I've uploaded a work in progress on the container design here:

http://erdani.com/d/phobos/std_container.html

It's deceptively simple - the entire design is a nomenclature, really. 
Any given container may choose to implement whichever primitives it can, 
if (and only if) it can satisfy the requirements. Beyond that, each 
container can define its own primitives.

The design is not fancy, which doesn't worry me in the least because I 
was aiming for the right design, not a fancy design. I feel this design 
is pretty close to what I really wanted.

The major players are the containers themselves and the ranges they 
define. Most operations are defined in terms of ranges, not containers. 
The containers only need to support a modicum of primitives that affect 
the topology of containers, plus a few convenience functions.

There are a bunch of "soft" primitives. Those are meant to put a stop to 
the iterator invalidation problems experienced in the STL. The container 
implementor may alias softXyz to xyz if she knows the operation won't 
mess the ranges currently iterating the container (which is the case for 
most node-based containers). I haven't yet discussed subtler cases in 
which a range starts with a removed element etc., but I plan to.

So, this is it really: the containers specification is a collection of 
capabilities. A given container picks the ones it can support 
meaningfully, and user code has the specification as semantic and 
complexity guarantees.

This design is quite far removed from Steve's, which makes the 
integration with dcollection tenuous. But at a minimum, maybe we can 
work out something in which Steve offers, with credit, implementation 
for this design and also offers dcollections as a separate library.


Andrei
May 25 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 25 May 2010 18:27:32 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I've uploaded a work in progress on the container design here:

 http://erdani.com/d/phobos/std_container.html

 It's deceptively simple - the entire design is a nomenclature, really.  
 Any given container may choose to implement whichever primitives it can,  
 if (and only if) it can satisfy the requirements. Beyond that, each  
 container can define its own primitives.

 The design is not fancy, which doesn't worry me in the least because I  
 was aiming for the right design, not a fancy design. I feel this design  
 is pretty close to what I really wanted.

 The major players are the containers themselves and the ranges they  
 define. Most operations are defined in terms of ranges, not containers.  
 The containers only need to support a modicum of primitives that affect  
 the topology of containers, plus a few convenience functions.

 There are a bunch of "soft" primitives. Those are meant to put a stop to  
 the iterator invalidation problems experienced in the STL. The container  
 implementor may alias softXyz to xyz if she knows the operation won't  
 mess the ranges currently iterating the container (which is the case for  
 most node-based containers). I haven't yet discussed subtler cases in  
 which a range starts with a removed element etc., but I plan to.

 So, this is it really: the containers specification is a collection of  
 capabilities. A given container picks the ones it can support  
 meaningfully, and user code has the specification as semantic and  
 complexity guarantees.

 This design is quite far removed from Steve's, which makes the  
 integration with dcollection tenuous. But at a minimum, maybe we can  
 work out something in which Steve offers, with credit, implementation  
 for this design and also offers dcollections as a separate library.

I take it from the comment in the docs that cursors can be an auxiliary range? Is there any reason not to define a mandatory cursor type (a cursor is elementary to define if a range is definable)? There is no remove(Range) function, this is a bad omission. It's one of the main reasons I created dcollections (quick element removal when element lookup is not quick). I don't like insertInstead, can we make it replace? What about the purge function of dcollections (i.e. removal while iterating)? What about opApply? Without opApply, you cannot use foreach to get both keys and values. I can probably submit my basic implementations, and use them from std.x to implement dcollections. This way, the complex pieces are shared. Dcollections definitely fills needs that this collection package doesn't, and it should be mostly compatible. -Steve
May 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 06:04 PM, Steven Schveighoffer wrote:
 On Tue, 25 May 2010 18:27:32 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I've uploaded a work in progress on the container design here:

 http://erdani.com/d/phobos/std_container.html

 It's deceptively simple - the entire design is a nomenclature, really.
 Any given container may choose to implement whichever primitives it
 can, if (and only if) it can satisfy the requirements. Beyond that,
 each container can define its own primitives.

 The design is not fancy, which doesn't worry me in the least because I
 was aiming for the right design, not a fancy design. I feel this
 design is pretty close to what I really wanted.

 The major players are the containers themselves and the ranges they
 define. Most operations are defined in terms of ranges, not
 containers. The containers only need to support a modicum of
 primitives that affect the topology of containers, plus a few
 convenience functions.

 There are a bunch of "soft" primitives. Those are meant to put a stop
 to the iterator invalidation problems experienced in the STL. The
 container implementor may alias softXyz to xyz if she knows the
 operation won't mess the ranges currently iterating the container
 (which is the case for most node-based containers). I haven't yet
 discussed subtler cases in which a range starts with a removed element
 etc., but I plan to.

 So, this is it really: the containers specification is a collection of
 capabilities. A given container picks the ones it can support
 meaningfully, and user code has the specification as semantic and
 complexity guarantees.

 This design is quite far removed from Steve's, which makes the
 integration with dcollection tenuous. But at a minimum, maybe we can
 work out something in which Steve offers, with credit, implementation
 for this design and also offers dcollections as a separate library.

I take it from the comment in the docs that cursors can be an auxiliary range? Is there any reason not to define a mandatory cursor type (a cursor is elementary to define if a range is definable)?

Yes, but the cursor would have to pass scrutiny for usefulness just like anything else.
 There is no remove(Range) function, this is a bad omission. It's one of
 the main reasons I created dcollections (quick element removal when
 element lookup is not quick).

I think it got lost when I reordered the code (I did a major reshuffling just before posting). I put it back.
 I don't like insertInstead, can we make it replace?

replace was the original name. insertInstead is consistent with the other two, so we have (softI|i)nsert[Before|After|Instead].
 What about the purge function of dcollections (i.e. removal while
 iterating)?

Removal while iterating is achieved through different means. I need to think through the details a bit more, but in essence it doesn't have to be a primitive. The primitives should allow implementation of a generic function that removes elements that satisfy a condition, something as follows: If the collection supports softRemove, if you want to remove an element while iterating with a range r, you can say coll.softRemove(r.take(1)). If it doesn't, I'm thinking of allowing the erase/remove idiom (http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Erase-Remove). For that, remove should return a range positioned right after the removed element. Let me think a bit more about that.
 What about opApply? Without opApply, you cannot use foreach to get both
 keys and values.

I'd like to design without opApply as part of the primitives. You can use foreach to iterate tuples of keys and values.
 I can probably submit my basic implementations, and use them from std.x
 to implement dcollections. This way, the complex pieces are shared.
 Dcollections definitely fills needs that this collection package
 doesn't, and it should be mostly compatible.

Sounds promising! Andrei
May 25 2010
parent reply Jerry Quinn <jlquinn optonline.net> writes:
Andrei Alexandrescu Wrote:

 On 05/25/2010 06:04 PM, Steven Schveighoffer wrote:
 On Tue, 25 May 2010 18:27:32 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I've uploaded a work in progress on the container design here:

 http://erdani.com/d/phobos/std_container.html

 It's deceptively simple - the entire design is a nomenclature, really.
 Any given container may choose to implement whichever primitives it
 can, if (and only if) it can satisfy the requirements. Beyond that,
 each container can define its own primitives.

 There are a bunch of "soft" primitives. Those are meant to put a stop
 to the iterator invalidation problems experienced in the STL. The
 container implementor may alias softXyz to xyz if she knows the
 operation won't mess the ranges currently iterating the container
 (which is the case for most node-based containers). I haven't yet
 discussed subtler cases in which a range starts with a removed element
 etc., but I plan to.



softXXX might be better named safeXXX or stableXXX. The names might be more suggestive of preserving iteration. The doc isn't quite clear whether make is a member function or standalone. I assume it's standalone, but that's worth firming up.
 I don't like insertInstead, can we make it replace?

replace was the original name. insertInstead is consistent with the other two, so we have (softI|i)nsert[Before|After|Instead].

I second the request for the name "replace". Despite the consistency, I think replace is clear to programmers as well as being more familiar and comfortable. Also, the operation really isn't "insert instead", it's "delete then insert instead". So there is lack of symmetry because it removes elements while the other does not. Another issue I see is that opIndex and its brethren takes KeyType, but for something like vector, this should be size_t.. I don't think you want ElementType to be tuple!(size_t, ElementType) in this case. Related to this, do you intend removal of a single element to use remove(range) or removeKey? Finally, opSlice(begin, end) is not there. Was this intentional or overlooked? Jerry
May 25 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 12:40 AM, Jerry Quinn wrote:
 softXXX might be better named safeXXX or stableXXX.  The names might
 be more suggestive of preserving iteration.

Done. "stable". Thanks.
 The doc isn't quite clear whether make is a member function or
 standalone.  I assume it's standalone, but that's worth firming up.

make is standalone, the indentation should give that away.
 I don't like insertInstead, can we make it replace?

replace was the original name. insertInstead is consistent with the other two, so we have (softI|i)nsert[Before|After|Instead].

I second the request for the name "replace". Despite the consistency, I think replace is clear to programmers as well as being more familiar and comfortable. Also, the operation really isn't "insert instead", it's "delete then insert instead". So there is lack of symmetry because it removes elements while the other does not.

Good point. I inserted "replace" instead of "insertInstead" (and just inserted a pun, too).
 Another  issue I see is that opIndex and its brethren takes KeyType,
 but for something like vector, this should be size_t..  I don't think
 you want ElementType to be tuple!(size_t, ElementType) in this case.

Arrays don't define KeyType, yet they define opIndex. That doesn't go against the spec. That being said, I'd like to see a bit more unification between arrays and associative arrays.
 Related to this, do you intend removal of a single element to use
 remove(range) or removeKey?

I think remove(range.take(1)) should remove one element. I'm also thinking of simplifying matters by defining remove(range, k) where k is an "up to" number.
 Finally, opSlice(begin, end) is not there.  Was this intentional or
 overlooked?

I'm still mulling over that. As was discussed in this group, $ is easy to detect statically but 0 is not. Some containers don't like nonzero beginning anchors, and I wouldn't want to make that a runtime test. If Walter won't make a language change, I'd have to define begin() as an anchor, and before you know it, cursors are in :o). To recap: a) arrays (but not associative arrays) can define c[a .. b] for integrals a and b. b) associative arrays can define c[a .. b] for a, b key types. It's nontrivial but it can be done. c) sentinel-terminated arrays (e.g. C stringz) can only define c[a .. $] for an integral a. d) lists can, with ho and hum, define c[0 .. a] for an integral a. The ho and hum comes from the fact that the range thus obtained is not a "proper" Range type, it's a Take!Range. e) Any other cases? Andrei
May 26 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:


 Finally, opSlice(begin, end) is not there.  Was this intentional or
 overlooked?

I'm still mulling over that. As was discussed in this group, $ is easy to detect statically but 0 is not. Some containers don't like nonzero beginning anchors, and I wouldn't want to make that a runtime test.

It would most likely not violate an O(lg(n)) requirement to check that the beginning anchor was indeed the beginning. In fact, it would be about as much work as opSlice(), since you have to find that beginning anchor anyways...
 If 
 Walter won't make a language change, I'd have to define begin() as an 
 anchor, and before you know it, cursors are in :o).

I'm all for a language change and cursors ;)
 b) associative arrays can define c[a .. b] for a, b key types. It's 
 nontrivial but it can be done.

I decided for dcollections that this only makes sense on sorted maps. It is a little questionable to check in runtime that b > a on something like a hash map, but besides that point, it's non deterministic. Slicing with keys on c[a..b] might work before a rehash, and not work after one. This latter point is what made me disallow it.
 
 c) sentinel-terminated arrays (e.g. C stringz) can only define c[a .. $] 
 for an integral a.

I really hope we are not considering null-terminated strings when deciding container functions...
 e) Any other cases?

On a sorted AA, slicing using any two keys are possible (as long as the keys exist in the AA). -Steve
May 26 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 04:47 PM, Steven Schveighoffer wrote:
 b) associative arrays can define c[a .. b] for a, b key types.
 It's nontrivial but it can be done.

I decided for dcollections that this only makes sense on sorted maps.

Yah, I meant sorted collections as well.
 c) sentinel-terminated arrays (e.g. C stringz) can only define c[a
 .. $] for an integral a.

I really hope we are not considering null-terminated strings when deciding container functions...

I don't know. When working on an abstraction I find it very, very useful to anchor it in concrete incarnations that I know of. (Such an approach has led to taming the gnarly UTF-8 strings into nice bidirectional ranges.) I'm not crazy about null-terminated strings, but slists are also sentinel terminated, they just aren't contiguous. Comparing and contrasting these and other concrete instantiations is great exercise that only makes the abstraction better.
 e) Any other cases?

On a sorted AA, slicing using any two keys are possible (as long as the keys exist in the AA).

Great. Andrei
May 26 2010
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 06:04 PM, Steven Schveighoffer wrote:
 What about the purge function of dcollections (i.e. removal while
 iterating)?

I changed remove and softRemove to return a range positioned to after the removed stuff. Andrei
May 25 2010
prev sibling next sibling parent reply BLS <windevguy hotmail.de> writes:
On 26/05/2010 01:04, Steven Schveighoffer wrote:
 I can probably submit my basic implementations, and use them from std.x
 to implement dcollections.  This way, the complex pieces are shared.
 Dcollections definitely fills needs that this collection package
 doesn't, and it should be mostly compatible.

 -Steve

Not that I like the idea of having (once again) two libs instead of one, but I am convinced that the separation between core data- structures, say xxTree, xxList, xxNode etc. and the final implementation f.i. Set, Dictionary/Map. is a very good thing. std.structures std.algorithm std.container/collection -- Next, what's wrong with simple collections, stack, queue etc. Guess too simple for u guys ;) -- I regret a bit that you haven't picked up the idea of collection events. IMHO this is the smartest way to implement a UnDo/ReDo Stack or to create a MultiSet based on a simple Set. Bjoern
May 26 2010
next sibling parent reply BLS <windevguy hotmail.de> writes:
On 26/05/2010 17:31, BLS wrote:
 regret a bit that you haven't picked up the idea of collection events.
 IMHO this is the smartest way to implement a UnDo/ReDo Stack or to
 create a MultiSet based on a simple Set.

and since Dr. Dobbs is probably a more serious source. http://www.drdobbs.com/windows/199902700 and a concrete implementation on page : 5 http://www.drdobbs.com/windows/199902700;jsessionid=A10YNSK2DXVJ3QE1GHOSKH4ATMY32JVN?pgno=5 hth, bjoern
May 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 11:08 AM, BLS wrote:
 On 26/05/2010 17:31, BLS wrote:
 regret a bit that you haven't picked up the idea of collection events.
 IMHO this is the smartest way to implement a UnDo/ReDo Stack or to
 create a MultiSet based on a simple Set.

and since Dr. Dobbs is probably a more serious source. http://www.drdobbs.com/windows/199902700 and a concrete implementation on page : 5 http://www.drdobbs.com/windows/199902700;jsessionid=A10YNSK2DXVJ3QE1GHOSKH4ATMY32JVN?pgno=5

Whoa, that does look like a huge inheritance diagram. http://www.drdobbs.com/windows/199902700;jsessionid=CQ3Z0PZ33GJCRQE1GHOSKHWATMY32JVN?pgno=2 I swear I saw a kitchen sink in there somewhere. Andrei
May 26 2010
parent BLS <windevguy hotmail.de> writes:
On 26/05/2010 21:53, Andrei Alexandrescu wrote:
 On 05/26/2010 11:08 AM, BLS wrote:
 On 26/05/2010 17:31, BLS wrote:
 regret a bit that you haven't picked up the idea of collection events.
 IMHO this is the smartest way to implement a UnDo/ReDo Stack or to
 create a MultiSet based on a simple Set.

and since Dr. Dobbs is probably a more serious source. http://www.drdobbs.com/windows/199902700 and a concrete implementation on page : 5 http://www.drdobbs.com/windows/199902700;jsessionid=A10YNSK2DXVJ3QE1GHOSKH4ATMY32JVN?pgno=5

Whoa, that does look like a huge inheritance diagram. http://www.drdobbs.com/windows/199902700;jsessionid=CQ3Z0PZ33GJCRQE1GHOSKHWATMY32JVN?pgno=2 I swear I saw a kitchen sink in there somewhere.

Yeah, I was afraid that exactly this happened...giving a link.. you guys are watching the complete collection lib.. and nobody cares about a simple,smart detail named collection update events. beside, doable without creating an inheritance monster ;) bls
 Andrei

May 26 2010
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2010-05-26 17.31, BLS wrote:
 On 26/05/2010 01:04, Steven Schveighoffer wrote:
 I can probably submit my basic implementations, and use them from std.x
 to implement dcollections. This way, the complex pieces are shared.
 Dcollections definitely fills needs that this collection package
 doesn't, and it should be mostly compatible.

 -Steve

Not that I like the idea of having (once again) two libs instead of one, but I am convinced that the separation between core data- structures, say xxTree, xxList, xxNode etc. and the final implementation f.i. Set, Dictionary/Map. is a very good thing. std.structures std.algorithm std.container/collection -- Next, what's wrong with simple collections, stack, queue etc. Guess too simple for u guys ;) -- I regret a bit that you haven't picked up the idea of collection events. IMHO this is the smartest way to implement a UnDo/ReDo Stack or to create a MultiSet based on a simple Set. Bjoern

Yes, events would be nice to have. Just simulating C# events or something similar. -- /Jacob Carlborg
May 26 2010
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
BLS Wrote:

 On 26/05/2010 01:04, Steven Schveighoffer wrote:
 I can probably submit my basic implementations, and use them from std.x
 to implement dcollections.  This way, the complex pieces are shared.
 Dcollections definitely fills needs that this collection package
 doesn't, and it should be mostly compatible.

 -Steve

Not that I like the idea of having (once again) two libs instead of one, but I am convinced that the separation between core data- structures, say xxTree, xxList, xxNode etc. and the final implementation f.i. Set, Dictionary/Map. is a very good thing.

I don't think it will be analogous to a Tango/Phobos separation. Dcollections supports ranges, and the major interface to Andrei's containers is ranges, and if my implementations are accepted, both will probably even be using the same underlying implementations. So I think the two libs will quite happily exist together. I am disappointed that dcollections wasn't chosen, but given the eventual API that Andrei has come up with, I think it didn't really have a shot from the beginning.
 I regret a bit that you haven't picked up the idea of collection events.
 IMHO this is the smartest way to implement a UnDo/ReDo Stack or to 
 create a MultiSet based on a simple Set.

I don't know much about it. If it makes sense to be used in dcollections, I might do it. Right now, however, I want to concentrate on getting dcollections out of beta. -Steve
May 26 2010
next sibling parent reply BLS <windevguy hotmail.de> writes:
On 26/05/2010 23:59, Steven Schveighoffer wrote:
 I don't think it will be analogous to a Tango/Phobos separation.  Dcollections
supports ranges, and the major interface to Andrei's containers is ranges, and
if my implementations are accepted, both will probably even be using the same
underlying implementations.  So I think the two libs will quite happily exist
together.  I am disappointed that dcollections wasn't chosen, but given the
eventual API that Andrei has come up with, I think it didn't really have a shot
from the beginning.

swapped) feature of underlaying data structures dCollections has support, std.collection... dunno In case that Andrei decides to ignore that feature I can't see how Dcollections and std.collection will harmonize. bjoern
May 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 09:47 PM, BLS wrote:
 On 26/05/2010 23:59, Steven Schveighoffer wrote:
 I don't think it will be analogous to a Tango/Phobos separation.
 Dcollections supports ranges, and the major interface to Andrei's
 containers is ranges, and if my implementations are accepted, both
 will probably even be using the same underlying implementations. So I
 think the two libs will quite happily exist together. I am
 disappointed that dcollections wasn't chosen, but given the eventual
 API that Andrei has come up with, I think it didn't really have a shot
 from the beginning.

swapped) feature of underlaying data structures dCollections has support, std.collection... dunno In case that Andrei decides to ignore that feature I can't see how Dcollections and std.collection will harmonize.

It's even better than that, but you need to look at things a bit differently. std.container will not contain hot-swappable components. It will contain components that could be used in some of the hot swaps. It's just one level lower than what you are discussing. That doesn't make it any more or less incompatible with dcollections. Andrei
May 26 2010
parent BLS <windevguy hotmail.de> writes:
On 27/05/2010 06:06, Andrei Alexandrescu wrote:
 std.container will not contain hot-swappable components. It will contain
 components that could be used in some of the hot swaps. It's just one
 level lower than what you are discussing.

 That doesn't make it any more or less incompatible with dcollections.


 Andrei

Thanks for taking the time to explain. Whatever direction D collections will take, I think we all agree that collections are the "missing link" in D. Once done, I am convinced that a number of D2 add on libs will grow drastically. Just can speak for myself, f.i. creating a dock /undock, tabbed MDI GUI without having a solid collection lib is not really a pleasure.. Bjoern
May 27 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 04:59 PM, Steven Schveighoffer wrote:
 BLS Wrote:

 On 26/05/2010 01:04, Steven Schveighoffer wrote:
 I can probably submit my basic implementations, and use them from
 std.x to implement dcollections.  This way, the complex pieces
 are shared. Dcollections definitely fills needs that this
 collection package doesn't, and it should be mostly compatible.

 -Steve

Not that I like the idea of having (once again) two libs instead of one, but I am convinced that the separation between core data- structures, say xxTree, xxList, xxNode etc. and the final implementation f.i. Set, Dictionary/Map. is a very good thing.

I don't think it will be analogous to a Tango/Phobos separation.

Same here.
 Dcollections supports ranges, and the major interface to Andrei's
 containers is ranges, and if my implementations are accepted, both
 will probably even be using the same underlying implementations.  So
 I think the two libs will quite happily exist together.  I am
 disappointed that dcollections wasn't chosen, but given the eventual
 API that Andrei has come up with, I think it didn't really have a
 shot from the beginning.

I much appreciate your kind willingness to essentially help push D forward before all else. During a private talk of a couple of days ago, Walter gave me a vote of confidence by essentially saying: "Whatever container library makes it in Phobos, I want to to be by your vision, not someone else's." This is nice to know, but at the same time piles a high responsibility because it puts me in position to decide on e.g. integration of dcollection into Phobos. Now say you put yourself in my place and Walter tells you that. You obviously have some idea on how you want a container library to be, because you wrote one! So most likely you'd put your design in, not mine or anyone else's, and you'd be using and accepting other designs and implementations only to the extent they satisfy your vision. This is all rhetorical because it's clear to me you have such an understanding of the situation, but it's a sort of a public rehashing of the dynamics that's going on right now. Andrei
May 26 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 06:04 PM, Steven Schveighoffer wrote:
 What about the purge function of dcollections (i.e. removal while
 iterating)?

I operated a few changes to the nomenclature to introduce better support for removal, in particular removal while iterating. In particular I tightened the complexity of remove and added linearRemove. http://erdani.com/d/phobos/std_container.html There are two idioms of choice: a) If the container supports remove(), then you can remove during iteration as follows: Range r = container[]; while (!r.empty) { if (remove_this_guy) r = container.remove(r.take(1)); else r.popFront(); } That's the case for many associative containers which have low cost for removal. Regular arrays won't be able to implement remove() because it must have logarithmic complexity. b) If the container doesn't support remove(), removal is done by packing towards the front the stuff to be kept, and then getting rid of it: Range r = container[]; Range yank = r; for (; !r.empty; r.popFront()) { if (!remove_this_guy) { yank.front = r.front; yank.popFront(); } } container.linearRemove(yank); See also the remove primitive in std.algorithm which does this (of course without the linearRemove call) using a predicate for remove_this_guy. Andrei
May 26 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

I feel this design is pretty close to what I really wanted.<

Good :-) How is opApply playing in the design of the containers? Can opSlice return a struct that defines opApply?
 There are a bunch of "soft" primitives. Those are meant to put a stop to 
 the iterator invalidation problems experienced in the STL.

I can suggest another thing, that doesn't replace this idea, but can make the nonsoft primitives a little safer when the program is compiled in nonrelease mode: http://d.puremagic.com/issues/show_bug.cgi?id=4179
any container must be a reference type, whether implemented as a class or
struct.<

This probably makes their usage simpler, so this can be the right decision. But then you can't define something like a TinyHashSet or a StaticBitSet that are better allocated on the stack... (a StaticBitSet is a struct template I have defined, at compile time you give it its length and it gets allocated on the stack, and represents a set of integers in an intgerval, or something similar). Why is length O(log(n))? If a linked list has no length field, to compute its length it has to scan all of it. make(): this has just killed the new :-) Among the standard primitives is it useful to have something to ask the actual computational complexity of a specific operation of a specific container? There are some ways to implement this, the simplest is to define enum with the most common complexities, and then a collection can contain an enum (const) value for each operation, something like: enum lengthComplexity = CompComplexity.linear; I don't know if this can be useful (to choose collections, to infer code complexity, etc). Bye, bearophile
May 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 06:18 PM, bearophile wrote:
 Andrei Alexandrescu:

 I feel this design is pretty close to what I really wanted.<

Good :-) How is opApply playing in the design of the containers? Can opSlice return a struct that defines opApply?

I hope to work with ranges only.
 There are a bunch of "soft" primitives. Those are meant to put a stop to
 the iterator invalidation problems experienced in the STL.

I can suggest another thing, that doesn't replace this idea, but can make the nonsoft primitives a little safer when the program is compiled in nonrelease mode: http://d.puremagic.com/issues/show_bug.cgi?id=4179

Will look into it, sounds like an implementation matter.
 any container must be a reference type, whether implemented as a class or
struct.<

This probably makes their usage simpler, so this can be the right decision. But then you can't define something like a TinyHashSet or a StaticBitSet that are better allocated on the stack... (a StaticBitSet is a struct template I have defined, at compile time you give it its length and it gets allocated on the stack, and represents a set of integers in an intgerval, or something similar).

I'm guessing some value container-like entities wouldn't harm if they can easily be converted to references.
 Why is length O(log(n))? If a linked list has no length field, to compute its
length it has to scan all of it.

I forgot the case I had in mind. I know that opSlice() must be O(log n) for e.g. tree traversals. Probably some trees could save some state if they exploit O(log n) length.
 make(): this has just killed the new :-)

 Among the standard primitives is it useful to have something to ask the actual
computational complexity of a specific operation of a specific container? There
are some ways to implement this, the simplest is to define enum with the most
common complexities, and then a collection can contain an enum (const) value
for each operation, something like:
 enum lengthComplexity = CompComplexity.linear;
 I don't know if this can be useful (to choose collections, to infer code
complexity, etc).

If it can be used gainfully, that would be great. Intriguing idea. Andrei
May 25 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei A.:

I hope to work with ranges only.<

Programming life is complex, you can't fit all of it in one schema ("A foolish consistency is the hobgoblin of little minds").
Will look into it, sounds like an implementation matter.<

Yes, right. But to implement this idea the foreach() has to change a bit, to set the flag in nonrelease mode. If implemented this idea lessens a bit the need for the stable ("soft") methods.
Probably some trees could save some state if they exploit O(log n) length.<

In general a length can be O(n) if for example you want to compute it on a linked list that doesn't keep the number of items inserted. Are you going to just not give a length attribute for such linked lists (so users have to use something like walkLength)? Bye, bearophile
May 26 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 05/26/2010 11:44 AM, bearophile wrote:
 Andrei A.:

 I hope to work with ranges only.<

Programming life is complex, you can't fit all of it in one schema ("A foolish consistency is the hobgoblin of little minds").

Couldn't you just define opApply where you need it and it will be used where foreach is used by default, anyway?
 Will look into it, sounds like an implementation matter.<

Yes, right. But to implement this idea the foreach() has to change a bit, to set the flag in nonrelease mode. If implemented this idea lessens a bit the need for the stable ("soft") methods.
 Probably some trees could save some state if they exploit O(log n) length.<

In general a length can be O(n) if for example you want to compute it on a linked list that doesn't keep the number of items inserted. Are you going to just not give a length attribute for such linked lists (so users have to use something like walkLength)?

Isn't that a good thing? If I know length is fast, I can use it without worries, but if it might be O(n) you need to avoid it.
May 26 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 07:44 AM, Pelle wrote:
 On 05/26/2010 11:44 AM, bearophile wrote:
 Andrei A.:

 I hope to work with ranges only.<

Programming life is complex, you can't fit all of it in one schema ("A foolish consistency is the hobgoblin of little minds").

Couldn't you just define opApply where you need it and it will be used where foreach is used by default, anyway?
 Will look into it, sounds like an implementation matter.<

Yes, right. But to implement this idea the foreach() has to change a bit, to set the flag in nonrelease mode. If implemented this idea lessens a bit the need for the stable ("soft") methods.
 Probably some trees could save some state if they exploit O(log n)
 length.<

In general a length can be O(n) if for example you want to compute it on a linked list that doesn't keep the number of items inserted. Are you going to just not give a length attribute for such linked lists (so users have to use something like walkLength)?

Isn't that a good thing? If I know length is fast, I can use it without worries, but if it might be O(n) you need to avoid it.

Yes, that's exactly the idea. BTW, I'd misunderstood the question. Bearophile asked "Why O(log n) and not O(n)?" and I heard: "Why O(log n) and not O(1)?" So I'd gotten confused a bit and answered the wrong question. Andrei
May 26 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 04:44 AM, bearophile wrote:
 Andrei A.:

 I hope to work with ranges only.<

Programming life is complex, you can't fit all of it in one schema ("A foolish consistency is the hobgoblin of little minds").
 Will look into it, sounds like an implementation matter.<

Yes, right. But to implement this idea the foreach() has to change a bit, to set the flag in nonrelease mode. If implemented this idea lessens a bit the need for the stable ("soft") methods.
 Probably some trees could save some state if they exploit O(log n)
 length.<

In general a length can be O(n) if for example you want to compute it on a linked list that doesn't keep the number of items inserted. Are you going to just not give a length attribute for such linked lists (so users have to use something like walkLength)?

Correct, some collections (notably SList) will simply not provide length(). In fact I will implement SList soon as an illustration of the design. Andrei
May 26 2010
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-25 19:18:43 -0400, bearophile <bearophileHUGS lycos.com> said:

 Andrei Alexandrescu:
 
 any container must be a reference type, whether implemented as a class 
 or struct.<

This probably makes their usage simpler, so this can be the right decision. But then you can't define something like a TinyHashSet or a StaticBitSet that are better allocated on the stack...

Well, in a way I think you can, but you have to stretch the definition a bit. A value-type container you can move but can't copy (because you used " disable this(this)") is semantically indistinguishable to a reference-type container with a unique non-copiable (but moveable) reference. The only problem is that most algorithms probably won't work with such a thing, they'll expect a copy of the reference right in their function arguments. This does bother me a little. That it allows statically allocated collections is something I like a lot of the C++ container model. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 26 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element? 2. What's the affect of clear() on the capacity? 3. Shouldn't KeyTypes be a type tuple?
May 25 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.
 2. What's the affect of clear() on the capacity?

There is no guaranteed effect.
 3. Shouldn't KeyTypes be a type tuple?

Yes. At the end of the day you must be able to say KeyTypes!3 to get the fourth type there. You say it should be KeyTypes[3]. I see tuple trouble, sigh. With a template I feel I have more control. Andrei
May 25 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.
 2. What's the affect of clear() on the capacity?

There is no guaranteed effect.

Should say so in the spec.
 3. Shouldn't KeyTypes be a type tuple?

Yes. At the end of the day you must be able to say KeyTypes!3 to get the fourth type there. You say it should be KeyTypes[3]. I see tuple trouble, sigh. With a template I feel I have more control.

The reason I prefer a tuple for this is then I can ask KeyTypes.length, whereas with the template that's not specified by the spec.
May 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType? Andrei
May 25 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?
May 25 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 09:07 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?

For a hash K->V, KeyType is K, ValueType is V, and ElementType is Tuple!(K, V). Andrei
May 25 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 05/25/2010 09:07 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?

For a hash K->V, KeyType is K, ValueType is V, and ElementType is Tuple!(K, V).

Ok, that makes it clear, and it needs to be in the spec.
May 25 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 09:29 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 09:07 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?

For a hash K->V, KeyType is K, ValueType is V, and ElementType is Tuple!(K, V).

Ok, that makes it clear, and it needs to be in the spec.

Done. Andrei
May 26 2010
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 On 05/25/2010 09:07 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?

For a hash K->V, KeyType is K, ValueType is V, and ElementType is Tuple!(K, V). Andrei

How about simple arrays? There's a lot of similarity between T[] and T[int]...
May 26 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 07:38 AM, Jason House wrote:
 Andrei Alexandrescu Wrote:

 On 05/25/2010 09:07 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 On 05/25/2010 07:35 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

Great! Some nitpicky comments: 1. What's the difference between a value and an element?

None.

Then I suggest sticking with one or the other throughout the spec. Also, there's both an ElementType and a ValueType.

Sorry, I was wrong. ValueType is defined for keyed containers to mean the mapped type. Should I call it MappedType?

Can a container have different ElementTypes from ValueTypes?

For a hash K->V, KeyType is K, ValueType is V, and ElementType is Tuple!(K, V). Andrei

How about simple arrays? There's a lot of similarity between T[] and T[int]...

A simple array doesn't have a Tuple as the unit of storage. Indexing is a consequence of array's layout so I consider it different. Andrei
May 26 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/25/2010 08:31 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 2. What's the affect of clear() on the capacity?

There is no guaranteed effect.

Should say so in the spec.

I guess if it's missing then it's implied.
 3. Shouldn't KeyTypes be a type tuple?

Yes. At the end of the day you must be able to say KeyTypes!3 to get the fourth type there. You say it should be KeyTypes[3]. I see tuple trouble, sigh. With a template I feel I have more control.

The reason I prefer a tuple for this is then I can ask KeyTypes.length, whereas with the template that's not specified by the spec.

OK. Andrei
May 25 2010
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:
 
 http://erdani.com/d/phobos/std_container.html
 
 It's deceptively simple - the entire design is a nomenclature, really. 
 Any given container may choose to implement whichever primitives it can, 
 if (and only if) it can satisfy the requirements. Beyond that, each 
 container can define its own primitives.

Looks great to me. A couple of comments:
 There are a bunch of "soft" primitives. Those are meant to put a stop to 
 the iterator invalidation problems experienced in the STL. The container 
 implementor may alias softXyz to xyz if she knows the operation won't 
 mess the ranges currently iterating the container (which is the case for 
 most node-based containers). I haven't yet discussed subtler cases in 
 which a range starts with a removed element etc., but I plan to.

The softXXX primitives are listed as having the same complexity as the non-soft primitives. Should it be explicitly stated that non-soft should always be at least as fast as soft? (So that if you don't need the soft guarantee, you should always use the non-soft primitive?) Also, I agree with Jerry that something like "stable" would be more intuitive than "soft". I guess it's not exactly the same meaning as stableSort, but it's pretty close. A problem with the name "soft" is that it's a harder, stronger guarantee! It seems to be a concept of a "tame" removal as opposed to reckless, "savage" removal?
 
 So, this is it really: the containers specification is a collection of 
 capabilities. A given container picks the ones it can support 
 meaningfully, and user code has the specification as semantic and 
 complexity guarantees.
 
 This design is quite far removed from Steve's, which makes the 
 integration with dcollection tenuous. But at a minimum, maybe we can 
 work out something in which Steve offers, with credit, implementation 
 for this design and also offers dcollections as a separate library.
 
 
 Andrei

May 25 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 01:45 AM, Don wrote:
 Andrei Alexandrescu wrote:
 I've uploaded a work in progress on the container design here:

 http://erdani.com/d/phobos/std_container.html

 It's deceptively simple - the entire design is a nomenclature, really.
 Any given container may choose to implement whichever primitives it
 can, if (and only if) it can satisfy the requirements. Beyond that,
 each container can define its own primitives.

Looks great to me. A couple of comments:
 There are a bunch of "soft" primitives. Those are meant to put a stop
 to the iterator invalidation problems experienced in the STL. The
 container implementor may alias softXyz to xyz if she knows the
 operation won't mess the ranges currently iterating the container
 (which is the case for most node-based containers). I haven't yet
 discussed subtler cases in which a range starts with a removed element
 etc., but I plan to.

The softXXX primitives are listed as having the same complexity as the non-soft primitives. Should it be explicitly stated that non-soft should always be at least as fast as soft? (So that if you don't need the soft guarantee, you should always use the non-soft primitive?) Also, I agree with Jerry that something like "stable" would be more intuitive than "soft". I guess it's not exactly the same meaning as stableSort, but it's pretty close. A problem with the name "soft" is that it's a harder, stronger guarantee! It seems to be a concept of a "tame" removal as opposed to reckless, "savage" removal?

Renamed to "stable". Andrei
May 26 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-25 18:27:32 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 There are a bunch of "soft" primitives. Those are meant to put a stop 
 to the iterator invalidation problems experienced in the STL. The 
 container implementor may alias softXyz to xyz if she knows the 
 operation won't mess the ranges currently iterating the container 
 (which is the case for most node-based containers). I haven't yet 
 discussed subtler cases in which a range starts with a removed element 
 etc., but I plan to.

Looks good, but I think something is missing. While I agree with the idea of having distinguishable "soft" operations, to pass it to other functions you should be able to make a "soft" shell around your container and pass that shell mapping non-soft functions to soft ones, ensuring only soft functions can be called. It could look like this: insertAtRandomPlace(myContainer.soft, element); Now you know insertAtRandom won't and *can't* invalidate your iterators/ranges, and you don't need to write a separate "softInsertAtRandom" that only calls soft functions. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 09:37 AM, Michel Fortin wrote:
 On 2010-05-25 18:27:32 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 There are a bunch of "soft" primitives. Those are meant to put a stop
 to the iterator invalidation problems experienced in the STL. The
 container implementor may alias softXyz to xyz if she knows the
 operation won't mess the ranges currently iterating the container
 (which is the case for most node-based containers). I haven't yet
 discussed subtler cases in which a range starts with a removed element
 etc., but I plan to.

Looks good, but I think something is missing. While I agree with the idea of having distinguishable "soft" operations, to pass it to other functions you should be able to make a "soft" shell around your container and pass that shell mapping non-soft functions to soft ones, ensuring only soft functions can be called. It could look like this: insertAtRandomPlace(myContainer.soft, element); Now you know insertAtRandom won't and *can't* invalidate your iterators/ranges, and you don't need to write a separate "softInsertAtRandom" that only calls soft functions.

Initially I thought containers that naturally support soft (well, "stable" since 5 minutes ago) operations would simply alias the non-stable versions and the stable versions to the same call. But now you got me thinking that a container might want to take advantage of both to implement slightly different approaches. I haven't met such a container yet, but that doesn't mean they aren't possible. Andrei
May 26 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-26 11:13:38 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 05/26/2010 09:37 AM, Michel Fortin wrote:
 
 insertAtRandomPlace(myContainer.soft, element);
 
 Now you know insertAtRandom won't and *can't* invalidate your
 iterators/ranges, and you don't need to write a separate
 "softInsertAtRandom" that only calls soft functions.

Initially I thought containers that naturally support soft (well, "stable" since 5 minutes ago) operations would simply alias the non-stable versions and the stable versions to the same call. But now you got me thinking that a container might want to take advantage of both to implement slightly different approaches. I haven't met such a container yet, but that doesn't mean they aren't possible.

While it's true that a container could implement a different approach for stable and unstable operations, I was more concerned with the need to guaranty that only stable operations are performed when passing the container to other functions, hence the need for a "soft" (now "stable") container wrapper. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 26 2010
prev sibling parent reply Jonathan M Davis <jmdavisProg gmail.com> writes:
Looks interesting overall. There is one function, however, which makes no 
sense to me: removeElement()/stableRemoveElement().

So, it basically removes a random element from the container? It could be 
quite consistent as to which element it removes from the container (it being 
implementation-dependent), but effectively, it removes a random element? 
What's the point of that? I can't remember the last time - if ever - that I 
wanted to remove an element from a container and I didn't care which. Or am 
I misunderstanding what it does?

- Jonathan M Davis
May 26 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:
 There is one function, however, which makes no 
 sense to me: removeElement()/stableRemoveElement().

I have a similar function in my dlibs1, it's called pop (works with arrays and AAs), it's similar to the pop method function you find in Python sets and lists. On lists it removes and returns the last item:
 l = [1, 2, 3]
 l



 l.pop()



 l.pop()



 l.pop()



 l



But sets are not ordered, so it has to pop out items in a unpredictable way (but if you need items in a truly random order this is not the right thing to do):
 s = set(["ab", "cd", "ef"])
 s



 s.pop()



 s.pop()



 s



 s.pop()



 s



So I think this is an useful method. But I think its name is too much long and not clear enough, because it's not a delete, it also returns the item. So I think I'd like to call it just pop()/stablePop() :-) Bye, bearophile
May 26 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 06:07 PM, Jonathan M Davis wrote:
 Looks interesting overall. There is one function, however, which makes no
 sense to me: removeElement()/stableRemoveElement().

 So, it basically removes a random element from the container? It could be
 quite consistent as to which element it removes from the container (it being
 implementation-dependent), but effectively, it removes a random element?
 What's the point of that? I can't remember the last time - if ever - that I
 wanted to remove an element from a container and I didn't care which. Or am
 I misunderstanding what it does?

 - Jonathan M Davis

If the container is a worklist with items to work on, it sometimes doesn't matter in which order you extract them. Andrei
May 26 2010
next sibling parent reply Jonathan M Davis <jmdavisProg gmail.com> writes:
Andrei Alexandrescu wrote:

 On 05/26/2010 06:07 PM, Jonathan M Davis wrote:
 Looks interesting overall. There is one function, however, which makes no
 sense to me: removeElement()/stableRemoveElement().

 So, it basically removes a random element from the container? It could be
 quite consistent as to which element it removes from the container (it
 being implementation-dependent), but effectively, it removes a random
 element? What's the point of that? I can't remember the last time - if
 ever - that I wanted to remove an element from a container and I didn't
 care which. Or am I misunderstanding what it does?

 - Jonathan M Davis

If the container is a worklist with items to work on, it sometimes doesn't matter in which order you extract them. Andrei

Well, my first reaction would be that that would be needed rarely enough that there's no point in putting it in the API. You could just use removeFront() or removeBack(), or you could grab a random element from the container and remove that. But maybe there are container types where it really makes sense, and having it in the API could be useful. Still, it strikes me as a really weird function to have. Of course, if you really think that it's going to be useful, leave it in. Unlike pretty much all the others though, it's not one I ever expect to use. - Jonathan M Davis
May 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 08:30 PM, Jonathan M Davis wrote:
 Andrei Alexandrescu wrote:

 On 05/26/2010 06:07 PM, Jonathan M Davis wrote:
 Looks interesting overall. There is one function, however, which makes no
 sense to me: removeElement()/stableRemoveElement().

 So, it basically removes a random element from the container? It could be
 quite consistent as to which element it removes from the container (it
 being implementation-dependent), but effectively, it removes a random
 element? What's the point of that? I can't remember the last time - if
 ever - that I wanted to remove an element from a container and I didn't
 care which. Or am I misunderstanding what it does?

 - Jonathan M Davis

If the container is a worklist with items to work on, it sometimes doesn't matter in which order you extract them. Andrei

Well, my first reaction would be that that would be needed rarely enough that there's no point in putting it in the API. You could just use removeFront() or removeBack(), or you could grab a random element from the container and remove that.

SList can't implement removeBack() and Array can't implement removeFront(). Only a few containers can implement grabbing a random element within the complexity constraints of removeElement().
 But maybe there are container types where it
 really makes sense, and having it in the API could be useful. Still, it
 strikes me as a really weird function to have. Of course, if you really
 think that it's going to be useful, leave it in. Unlike pretty much all the
 others though, it's not one I ever expect to use.

It might not be weird if you want to write container-independent code. Andrei
May 26 2010
parent reply Jonathan M Davis <jmdavisProg gmail.com> writes:
Andrei Alexandrescu wrote:

 It might not be weird if you want to write container-independent code.

Well, I've never needed to do that particular operation on _any_ container, so it does strike me as weird regardless. I've basically always been looking to remove a specific element or elements or to remove the element at a specific location. I don't ever recall being in a situation where it made any sense to remove an element without caring which one. But that's obviously not to say that no one else would find it useful. And std.container does not and obviously should not revolve around what I alone would find useful. As for container-independent code, there are certainly times when I've written code that was container-independent, but I don't think that it's something that I've done all that often. I agree that it can be quite useful and powerful to do so, but generally I've found that it breaks down because the various containers are too disparate both in function and performance. For instance, the STL makes it so that you at least _think_ that you can write container-independent code, but it's quite easy to run into instances where you really want to use a function that's specific to a container instead of the version in the algorithm library, or the functions are just different enough between container types that it doesn't work, or the operation that you're trying to do works fine with one container but would invalidate iterators on another, or... The list goes on. Effective STL certainly advises you to not really write container-independent code, generally-speaking, because it doesn't really work. The fact that the operations in std.container carry specific performance constraints will make it work much better I think. Also, D's ability to alter how a function is defined based on the attributes of a given container makes it much easier to write algorithms which are efficient for each container type without having to worry about calling member functions in some cases and non-member functions in others or anything like that. So, I think that std.container really looks like it could lead me to write good, container-independent code. However, I don't have much experience with writing container-independent code because it doesn't work very well in C++, and it can be quite detrimental to performance in other languages like Java because the performance characteristics of different containers vary so much. In any case, std.container looks pretty good thus far. I just found removeElement() weird because I coudn't see why anyone would ever need such a function. - Jonathan M Davis P.S. I have to agree that removeAny() and removeAnyStable() are better names. It's certainly less confusing as to what they're doing.
May 26 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 11:15 PM, Jonathan M Davis wrote:
 The fact that the operations in std.container carry specific performance
 constraints will make it work much better I think. Also, D's ability to
 alter how a function is defined based on the attributes of a given container
 makes it much easier to write algorithms which are efficient for each
 container type without having to worry about calling member functions in
 some cases and non-member functions in others or anything like that. So, I
 think that std.container really looks like it could lead me to write good,
 container-independent code. However, I don't have much experience with
 writing container-independent code because it doesn't work very well in C++,
 and it can be quite detrimental to performance in other languages like Java
 because the performance characteristics of different containers vary so
 much.

Yeah, exactly. We're all in the same boat really. Time will tell how that really goes. Andrei
May 26 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 Well, I've never needed to do that particular operation on _any_ container, 
 so it does strike me as weird regardless. I've basically always been looking 
 to remove a specific element or elements or to remove the element at a 
 specific location.

You have probably missed my other answer. But I can add some more. That operation is common. You use it every time you have a container that doesn't define a deterministic order (like hash sets, hash associative arrays, and so on) and you want to process and consume its items. In such collection you can't ask to pop the last or first item because they are not defined. So you pop out one randomly. I have used it many times in my programs. Bye, bearophile
May 27 2010
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Jonathan M Davis:
 
 Well, I've never needed to do that particular operation on _any_ container, 
 so it does strike me as weird regardless. I've basically always been looking 
 to remove a specific element or elements or to remove the element at a 
 specific location.

You have probably missed my other answer. But I can add some more. That operation is common.

 You use it every time you have a container that doesn't define a deterministic
order (like hash sets, hash associative arrays, and so on) and you want to
process and consume its items. In such collection you can't ask to pop the last
or first item because they are not defined. So you pop out one randomly. I have
used it many times in my programs.

When is it better to do it that way, rather than just iterating over all elements, and then completely empty the container? (Just curious -- I'm having trouble thinking of a use case for this feature).
May 27 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Don:
 When is it better to do it that way, rather than just iterating over all 
 elements, and then completely empty the container?
 (Just curious -- I'm having trouble thinking of a use case for this 
 feature).

I'm having troubles understanding why two persons have troubles seeing use cases for this feature :-) Iterating over the container and then emptying the container is two operations, you have to keep in mind to empty it, while if you pop items out of it progressively you just need to keep in mind to do one thing, and you avoid forgetting the final cleaning. Also, the progressive popping out of items allows you to have a collection that is correct in every moment, there is no risk of "removing" two times an item, so you can pass around the data structure in any moment of this process of wearing it away. This is what you suggest (Python code): s = set(["ab", "bc", "de"]) def process_item(item): print item for item in s: process_item(item) s.clear() # removes all items But this is better, you can print the correct collection in any moment and you can't forget the final clear: s = set(["ab", "bc", "de"]) def process_item(item): print item def show_data(items): print items while s: process_item(s.pop()) show_data(s) Bye, bearophile
May 27 2010
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Don:
 When is it better to do it that way, rather than just iterating over all 
 elements, and then completely empty the container?
 (Just curious -- I'm having trouble thinking of a use case for this 
 feature).

I'm having troubles understanding why two persons have troubles seeing use cases for this feature :-) Iterating over the container and then emptying the container is two operations, you have to keep in mind to empty it, while if you pop items out of it progressively you just need to keep in mind to do one thing, and you avoid forgetting the final cleaning.

Yes, but if I understand correctly, the only reason to have removeAny _as a primitive_ is for speed. And iterating over the container followed by a single removal is almost always going to be much faster. If, however, speed is not critical, removeAny can be a generic function -- call removeFront() if present, else call removeBack(). And your examples would work just fine with that. I'm having trouble identifying a use case where it needs to be a primitive.
May 27 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Don:
 Yes, but if I understand correctly, the only reason to have removeAny 
 _as a primitive_ is for speed. And iterating over the container followed 
 by a single removal is almost always going to be much faster.

Most things in Python are designed to be handy first, and fast later. So I doubt performance has has any significance in this small piece of Python design. In both Python and D is pop() is useful because it allows your collection to represent coherently its decreased or increased number of items at all times (Andrei ha shown an example where you have to add and remove items continuously). Bye, bearophile
May 27 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/27/2010 06:49 AM, Don wrote:
 bearophile wrote:
 Jonathan M Davis:

 Well, I've never needed to do that particular operation on _any_
 container, so it does strike me as weird regardless. I've basically
 always been looking to remove a specific element or elements or to
 remove the element at a specific location.

You have probably missed my other answer. But I can add some more. That operation is common.

 You use it every time you have a container that doesn't define a
 deterministic order (like hash sets, hash associative arrays, and so
 on) and you want to process and consume its items. In such collection
 you can't ask to pop the last or first item because they are not
 defined. So you pop out one randomly. I have used it many times in my
 programs.

When is it better to do it that way, rather than just iterating over all elements, and then completely empty the container? (Just curious -- I'm having trouble thinking of a use case for this feature).

Again, any worklist-based algorithm will remove and add work items without minding for a specific order. http://cseweb.ucsd.edu/classes/sp00/cse231/report/node12.html Andrei
May 27 2010
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-26 20:09:15 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 05/26/2010 06:07 PM, Jonathan M Davis wrote:
 Looks interesting overall. There is one function, however, which makes no
 sense to me: removeElement()/stableRemoveElement().
 
 [...]

If the container is a worklist with items to work on, it sometimes doesn't matter in which order you extract them.

May I suggest naming it "removeAny"? This fits better with "removeBack" and "removeFront" (where the second word represent the position), and makes clear that you don't really care which element is removed. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 26 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/26/2010 09:29 PM, Michel Fortin wrote:
 On 2010-05-26 20:09:15 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 05/26/2010 06:07 PM, Jonathan M Davis wrote:
 Looks interesting overall. There is one function, however, which
 makes no
 sense to me: removeElement()/stableRemoveElement().

 [...]

If the container is a worklist with items to work on, it sometimes doesn't matter in which order you extract them.

May I suggest naming it "removeAny"? This fits better with "removeBack" and "removeFront" (where the second word represent the position), and makes clear that you don't really care which element is removed.

Done. removeAny was my choice as of a few months ago but I'd forgotten. Thanks! Andrei
May 26 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Done. removeAny was my choice as of a few months ago but I'd forgotten. 

I suggest to call it just "pop". Bye, bearophile
May 27 2010
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Jonathan M Davis Wrote:

 Looks interesting overall. There is one function, however, which makes no 
 sense to me: removeElement()/stableRemoveElement().
 
 So, it basically removes a random element from the container? It could be 
 quite consistent as to which element it removes from the container (it being 
 implementation-dependent), but effectively, it removes a random element? 
 What's the point of that? I can't remember the last time - if ever - that I 
 wanted to remove an element from a container and I didn't care which. Or am 
 I misunderstanding what it does?

I think you are misunderstanding. Random element means you can't tell which one is removed, but it doesn't *have* to be truly random :) It most likely will be the most convenient element to remove (maybe that would be a better description). In other words, you can't expect it to always be the last element, or the first element, or the lowest element, or whatever. So essentially, I think that's what you were asking for, and I think that's what Andrei meant. -Steve
May 26 2010
parent Jonathan M Davis <jmdavisProg gmail.com> writes:
Steven Schveighoffer wrote:

 Jonathan M Davis Wrote:
 
 Looks interesting overall. There is one function, however, which makes no
 sense to me: removeElement()/stableRemoveElement().
 
 So, it basically removes a random element from the container? It could be
 quite consistent as to which element it removes from the container (it
 being implementation-dependent), but effectively, it removes a random
 element? What's the point of that? I can't remember the last time - if
 ever - that I wanted to remove an element from a container and I didn't
 care which. Or am I misunderstanding what it does?

I think you are misunderstanding. Random element means you can't tell which one is removed, but it doesn't *have* to be truly random :) It most likely will be the most convenient element to remove (maybe that would be a better description). In other words, you can't expect it to always be the last element, or the first element, or the lowest element, or whatever. So essentially, I think that's what you were asking for, and I think that's what Andrei meant. -Steve

I don't think that I misunderstood, but I may not have stated it well. No, the element is not _truly_ random, but it is removing an unspecified element which could be any element in the container, and is therefore random in the sense that you aren't telling it which one to remove. I've never been in a situation where it made any sense to do that. So, the function struck me as really weird. If you wanted truly random, you'd have to implement a function that actually did random number generation or whatnot to decide which element to pick, and presumably, it would be abnormal to use that here (though I think that doing so would still follow the API). So, no, removeElement() (now removeAny()) is not truly random, but it isn't deterministic from the perspective of the programmer having any clue which one will be removed, and it may or may not be deterministic from the container's perspective. - Jonathan M Davis
May 26 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 27 May 2010 00:20:24 -0400, Jonathan M Davis  
<jmdavisProg gmail.com> wrote:

 Steven Schveighoffer wrote:

 Jonathan M Davis Wrote:

 Looks interesting overall. There is one function, however, which makes  
 no
 sense to me: removeElement()/stableRemoveElement().

 So, it basically removes a random element from the container? It could  
 be
 quite consistent as to which element it removes from the container (it
 being implementation-dependent), but effectively, it removes a random
 element? What's the point of that? I can't remember the last time - if
 ever - that I wanted to remove an element from a container and I didn't
 care which. Or am I misunderstanding what it does?

I think you are misunderstanding. Random element means you can't tell which one is removed, but it doesn't *have* to be truly random :) It most likely will be the most convenient element to remove (maybe that would be a better description). In other words, you can't expect it to always be the last element, or the first element, or the lowest element, or whatever. So essentially, I think that's what you were asking for, and I think that's what Andrei meant. -Steve

I don't think that I misunderstood, but I may not have stated it well. No, the element is not _truly_ random, but it is removing an unspecified element which could be any element in the container, and is therefore random in the sense that you aren't telling it which one to remove. I've never been in a situation where it made any sense to do that. So, the function struck me as really weird. If you wanted truly random, you'd have to implement a function that actually did random number generation or whatnot to decide which element to pick, and presumably, it would be abnormal to use that here (though I think that doing so would still follow the API). So, no, removeElement() (now removeAny()) is not truly random, but it isn't deterministic from the perspective of the programmer having any clue which one will be removed, and it may or may not be deterministic from the container's perspective.

OK. I think the point of removeAny is that it removes an element as fast as possible, however that can be implemented by the container. I.e. removing the back or front element may not be the fastest operation, think about something like a tree, where the fastest element to remove is probably the top element. -Steve
May 27 2010
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Thu, May 27, 2010 at 5:44 AM, Don <nospam nospam.com> wrote:
 bearophile wrote:
 Don:
 When is it better to do it that way, rather than just iterating over all
 elements, and then completely empty the container?
 (Just curious -- I'm having trouble thinking of a use case for this
 feature).

I'm having troubles understanding why two persons have troubles seeing use cases for this feature :-) Iterating over the container and then emptying the container is two operations, you have to keep in mind to empty it, while if you pop items out of it progressively you just need to keep in mind to do one thing, and you avoid forgetting the final cleaning.

Yes, but if I understand correctly, the only reason to have removeAny _as a primitive_ is for speed. And iterating over the container followed by a single removal is almost always going to be much faster. If, however, speed is not critical, removeAny can be a generic function -- call removeFront() if present, else call removeBack(). And your examples would work just fine with that. I'm having trouble identifying a use case where it needs to be a primitive.

Think of a graph algorithm where you add all the nodes you know about to a Set. Pop one, process it, and then add any nodes it's connected to that you haven't seen yet back to the Set. Repeat until nothing left to pop. --bb
May 27 2010