www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Containers

reply "bitwise" <bitwise.pvt gmail.com> writes:
Any interest in having these in Phobos?

https://github.com/bitwise-github/d-containers

Phobos doesn't currently have a Queue(T), and Array(T) leaves 
much to be desired. The containers I've built are very full 
featured, fast, and are unittested fairly thoroughly. I intend to 
add range checking to both containers as well. Inspiration was 
taken from C++'s vector and queue, C#'s generic List and Queue, 
and D's Array.

I'm not sure how the container's I've built would be integrated 
though. They do go against the current container spec, but for 
good reason.

The container spec says containers should be reference types, but 
I guess this clashed with the idea of Phobos being  nogc so 
someone tried to make Array(T) ref counted. The result is 
std.container.Array being a struct with an instance of RefCounted 
inside it, which is bad for performance, but also inflexible. 
Innocent looking code like the following will do 2 separate 
allocations: One for the RefCounted payload, and one for the 
Array's data. On top of being a performance hit, it doesn't allow 
the user to choose how they want to manage memory.

Array!int a = Array!int(1, 2, 3);    //  2 allocations, or else!  
:D
The containers I've built are simple value types with a postblit. Using this as a base, one can simply use the container as-is if they like(which I do), but it's also trivial to make either a ref-counted version, or GC version. See here for example: https://github.com/bitwise-github/d-containers/blob/master/main.d Bit
Sep 03 2015
next sibling parent reply "bitwise" <bitwise.pvt gmail.com> writes:
On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:

 I'm not sure how the container's I've built would be integrated
I suppose what I'm suggesting would be to integrate my new containers and modify the spec to explain the new value type containers, and start deprecating the old containers as better versions become available...starting with Array(T)... Or I could stop trying to make tangible contributions to D and just go bikeshed about =+ ;|
Sep 03 2015
parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 2015-09-03 at 21:11 +0000, bitwise via Digitalmars-d wrote:
 On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:
 
 I'm not sure how the container's I've built would be integrated
I suppose what I'm suggesting would be to integrate my new containers and modify the spec to explain the new value type containers, and start deprecating the old containers as better versions become available...starting with Array(T)... Or I could stop trying to make tangible contributions to D and just go bikeshed about =+
Isn't the best route here to make a trivially accessible library (via the Dub repository?) that people can choose to use instead of the bits of Phobos that these data structures replace? This will then allow the momentum of usage to apply pressure for the Phobos ones to be deprecated and your new ones to be absorbed into Phobos… -- Russel. ============================================================================= Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.net 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Sep 04 2015
parent reply "wobbles" <grogan.colin gmail.com> writes:
On Friday, 4 September 2015 at 10:25:24 UTC, Russel Winder wrote:
 On Thu, 2015-09-03 at 21:11 +0000, bitwise via Digitalmars-d 
 wrote:
 On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:
 
 I'm not sure how the container's I've built would be 
 integrated
I suppose what I'm suggesting would be to integrate my new containers and modify the spec to explain the new value type containers, and start deprecating the old containers as better versions become available...starting with Array(T)... Or I could stop trying to make tangible contributions to D and just go bikeshed about =+
Isn't the best route here to make a trivially accessible library (via the Dub repository?) that people can choose to use instead of the bits of Phobos that these data structures replace? This will then allow the momentum of usage to apply pressure for the Phobos ones to be deprecated and your new ones to be absorbed into Phobos…
I do think this is the best option for all new libraries that are to be potentially merged into phobos. It's how the python/Java world works too and I think they've done pretty well out of it.
Sep 04 2015
parent reply "bitwise" <bitwise.pvt gmail.com> writes:
On Friday, 4 September 2015 at 13:26:27 UTC, wobbles wrote:
 On Friday, 4 September 2015 at 10:25:24 UTC, Russel Winder 
 wrote:
 On Thu, 2015-09-03 at 21:11 +0000, bitwise via Digitalmars-d 
 wrote:
 On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:
 
 I'm not sure how the container's I've built would be 
 integrated
I suppose what I'm suggesting would be to integrate my new containers and modify the spec to explain the new value type containers, and start deprecating the old containers as better versions become available...starting with Array(T)... Or I could stop trying to make tangible contributions to D and just go bikeshed about =+
Isn't the best route here to make a trivially accessible library (via the Dub repository?) that people can choose to use instead of the bits of Phobos that these data structures replace? This will then allow the momentum of usage to apply pressure for the Phobos ones to be deprecated and your new ones to be absorbed into Phobos…
I do think this is the best option for all new libraries that are to be potentially merged into phobos. It's how the python/Java world works too and I think they've done pretty well out of it.
What I meant by that comment, is how the process would go of differentiating between the new and the old containers, allowing them to coexist until the old ones were removed. I've since put my containers into a package called "collections" which would differentiate them from the current "containers". They could then have their own documentation without being subject to the current container spec. The current container spec has several problems. It specifies containers as reference types, but then goes on to explain how that's only half true, and tries to explain explain the quirks involved, and recommend using "make" to reconcile the problem. This is confusing, inconsistent, and inflexible. Containers should all be either real reference types(classes) or real value types(structs). Given the above two choices, I would choose structs. With structs, your options extend all the way down to primitive value types. You can house the struct in your own class or a RefCounted(T) with no unnecessary cost or hassle. With classes, there are only two choices, which are as-is classes(which use GC and have to be new'ed), or RefCounted classes(which incur cost of ref counting and the additional allocation for payload). The only benefit I see of using classes would be to allow containers to inherit common interfaces, which I don't think is all that useful. This is another problem: "Containers do not form a class hierarchy, instead they implement a common set of primitives (see table below). These primitives each guarantee a specific worst case complexity and thus allow generic code to be written independently of the container implementation." I believe this is wrong, in that the point of abstraction should be the Ranges you get from the containers, not the containers themselves. I think it's a little silly for an Array(T) to have a "removeAny()" method. Then, there is the idea of range validity. I strongly disagree with the way this idea is presented by the current spec. For example, Array(T) has "stableRemoveBack". This is misleading, because although your program _may_ not crash by using a range to a modified container, the range may be pointing at the wrong elements. Or worse, the program could still crash. Array(T) has stableRemoveBack(), but if you call it and then access a range that was pointing to the last element in the container, you get an out of range exception. I believe a better definition of range validity would be that the range pointed to the exact same elements after modification of the container. With a linked list, people would have to understand that ranges were a pair of iterators, and that removing either end point of the range from the container would invalidate it. On Friday, 4 September 2015 at 11:11:03 UTC, BBasile wrote:
 On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:
[...]
I think that std.allocators is a prerequisite to implement the some new std containers.
agreed, it's on the todo list.
 New containers without std.mallocators would be an error. In 
 this sense, EMSI allocators are a bit more compliant (they 
 already use them, not exactly as required but   templates have 
 an optional param to indicate if the structures are GC-free or 
 not).
I looked at this, and I think this point should be abstracted away. I believe the allocators should expose the necessary information/traits to allow containers to know how to use them... possibly something like iterator tags in C++.
Sep 04 2015
next sibling parent reply "rsw0x" <anonymous anonymous.com> writes:
On Friday, 4 September 2015 at 22:21:16 UTC, bitwise wrote:
 It specifies containers as reference types, but then goes on to 
 explain how that's only half true, and tries to explain explain 
 the quirks involved, and recommend using "make" to reconcile 
 the problem. This is confusing, inconsistent, and inflexible. 
 Containers should all be either real reference types(classes) 
 or real value types(structs).
It's just another one of those major issues heavily ingrained in D that will never be fixed at the language level.
Sep 04 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 09/04/2015 06:36 PM, rsw0x wrote:
 On Friday, 4 September 2015 at 22:21:16 UTC, bitwise wrote:
 It specifies containers as reference types, but then goes on to
 explain how that's only half true, and tries to explain explain the
 quirks involved, and recommend using "make" to reconcile the problem.
 This is confusing, inconsistent, and inflexible. Containers should all
 be either real reference types(classes) or real value types(structs).
It's just another one of those major issues heavily ingrained in D that will never be fixed at the language level.
My thinking is that significant work in this(this) is poor D style. Eager copying for containers doesn't seem like the best way to go. -- Andrei
Sep 04 2015
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/05/2015 01:15 AM, Andrei Alexandrescu wrote:
 On 09/04/2015 06:36 PM, rsw0x wrote:
 On Friday, 4 September 2015 at 22:21:16 UTC, bitwise wrote:
 It specifies containers as reference types, but then goes on to
 explain how that's only half true, and tries to explain explain the
 quirks involved, and recommend using "make" to reconcile the problem.
 This is confusing, inconsistent, and inflexible. Containers should all
 be either real reference types(classes) or real value types(structs).
It's just another one of those major issues heavily ingrained in D that will never be fixed at the language level.
My thinking is that significant work in this(this) is poor D style. Eager copying for containers doesn't seem like the best way to go. -- Andrei
disable this(this) for ephemeral containers?
Sep 04 2015
parent "bitwise" <bitwise.pvt gmail.com> writes:
On Saturday, 5 September 2015 at 00:21:42 UTC, Timon Gehr wrote:
 On 09/05/2015 01:15 AM, Andrei Alexandrescu wrote:
 My thinking is that significant work in this(this) is poor D 
 style.
 Eager copying for containers doesn't seem like the best way to 
 go. --
 Andrei
disable this(this) for ephemeral containers?
I actually missed this message somehow, but this is what I'll do. A move-only container(base container) is fine by me. It'll probably end up sitting in a class, or the global scope anyways. Passing them around arbitrarily and joining them with "~" was not part of the plan. Finally, I can throw in one of these, and everybody wins: alias ListRef(T) = RefCounted!(List!T); Bit
Sep 06 2015
prev sibling next sibling parent "bitwise" <bitwise.pvt gmail.com> writes:
On Friday, 4 September 2015 at 23:15:54 UTC, Andrei Alexandrescu 
wrote:
 My thinking is that significant work in this(this) is poor D 
 style. Eager copying for containers doesn't seem like the best 
 way to go. -- Andrei
I would still prefer classes to the embedded RefCounted approach. It's more flexible. If a class is used, a user can wrap it in a RefCounted if they want, and eventually, if http://wiki.dlang.org/DIP74 works out, an extra template parameter could be provided to to enable the ref counting for the container without changing any of the caller's syntax. Example: class List(T, bool refCounted = false) { static if(refCounted) { // opAddRef(), opRelease(), etc.. } } I think my solution is still the base of something more robust, however. In the following example, all three storage types are allowed: struct ListBase(T) { } // represents my List(T) as currently implemented enum ListType { Stack, RefCounted, Heap } // defaults to RefCounted to avoid eager copying template List(T, ListType mode = ListType.RefCounted) { static if(mode == ListType.Stack) { alias List = ListBase!T; } else static if(mode == ListType.RefCounted) { alias List = RefCounted!(ListBase!T); } else { final class List { ListBase!T _list; alias _list this; this(Args...)(Args args) { _list = ListBase!T(args); } } } } void main(string[] args) { List!int a = [1, 2, 3]; // ref counted by default List!(int, ListType.Stack) b = [1, 2, 3]; // stack allocated auto c = new List!(int, ListType.Heap)([1, 2, 3]); // GC heap allocated }
Sep 04 2015
prev sibling parent reply "bitwise" <bitwise.pvt gmail.com> writes:
On Friday, 4 September 2015 at 23:15:54 UTC, Andrei Alexandrescu 
wrote:
 [...]
 My thinking is that significant work in this(this) is poor D 
 style. Eager copying for containers doesn't seem like the best 
 way to go. -- Andrei
I feel should add that I still very much favor simple value type containers. C++ does just fine with value-type containers. I know "D isn't C++" but if it works, it works. I think the D community often tries way too hard to be innovative, and in many cases, less is more. WYSIWYG is a much more important principal to me than whatever resulted in a RefCounted(T) that likes to dress up like a value type. Structs doing heavy work in this(this) may be bad D style, but structs that don't act like structs is worse.
Sep 04 2015
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 05-Sep-2015 05:36, bitwise wrote:
 On Friday, 4 September 2015 at 23:15:54 UTC, Andrei Alexandrescu wrote:
 [...]
 My thinking is that significant work in this(this) is poor D style.
 Eager copying for containers doesn't seem like the best way to go. --
 Andrei
I feel should add that I still very much favor simple value type containers. C++ does just fine with value-type containers. I know "D isn't C++" but if it works, it works.
I agree if only on the grounds of composability. Value-typed containers are "more" then ref-based ones. As we can always get: Unqiue!ValueTypedContainer or RC!ValueTypedContainer on top of value type container.
 I think the D community often tries way too hard
 to be innovative, and in many cases, less is more. WYSIWYG is a much
 more important principal to me than whatever resulted in a RefCounted(T)
 that likes to dress up like a value type. Structs doing heavy work in
 this(this) may be bad D style, but structs that don't act like structs
 is worse.
-- Dmitry Olshansky
Sep 04 2015
parent "bitwise" <bitwise.pvt gmail.com> writes:
On Saturday, 5 September 2015 at 06:44:30 UTC, Dmitry Olshansky 
wrote:
 On 05-Sep-2015 05:36, bitwise wrote:
 On Friday, 4 September 2015 at 23:15:54 UTC, Andrei 
 Alexandrescu wrote:
 [...]
I feel should add that I still very much favor simple value type containers. C++ does just fine with value-type containers. I know "D isn't C++" but if it works, it works. [...]
I agree if only on the grounds of composability. Value-typed containers are "more" then ref-based ones. As we can always get: Unqiue!ValueTypedContainer or RC!ValueTypedContainer on top of value type container.
or even a class! =) static import collections; final class List(T) { private collections.List!T _list; alias _list this; this(Args...)(Args args) { _list = collections.List!T(args); } }
Sep 05 2015
prev sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 2015-09-04 at 22:21 +0000, bitwise via Digitalmars-d wrote:
=20
[=E2=80=A6]
 This is another problem:
    "Containers do not form a class hierarchy, instead they=20
 implement a common set of primitives (see table below). These=20
 primitives each guarantee a specific worst case complexity and=20
 thus allow generic code to be written independently of the=20
 container implementation."
=20
 I believe this is wrong, in that the point of abstraction should=20
 be the Ranges you get from the containers, not the containers=20
 themselves. I think it's a little silly for an Array(T) to have a=20
 "removeAny()" method.
=20
=20
[=E2=80=A6] I think I am missing a step in the argument here: if the point is that Ranges are the core of the abstraction, why does there need to be a class inheritance hierarchy. The lessons from C++, Java, Python, and Go, include that the obsession with inheritance during the 1990s went far too far. Inheritance can be useful, but for frameworks it is only conformance to interfaces that matters for consistency across the framework. Thus as long as every data structure type conforms to the correct interface so that there is composability in the use of Ranges,=20 then things are fine. This though is philosophy and far too general. To continue this aspect of the debate we would need to deal in more specific things. Are there some examples from Phobos and your containers that we could pin this on? --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Sep 04 2015
parent "bitwise" <bitwise.pvt gmail.com> writes:
On Saturday, 5 September 2015 at 06:59:28 UTC, Russel Winder 
wrote:
 On Fri, 2015-09-04 at 22:21 +0000, bitwise via Digitalmars-d 
 wrote:
 This is another problem:

 --->FROM THE D DOCS
    "Containers do not form a class hierarchy, instead they
 implement a common set of primitives (see table below). These
 primitives each guarantee a specific worst case complexity and
 thus allow generic code to be written independently of the
 container implementation."
I believe this is wrong, in that the point of abstraction should be the Ranges you get from the containers, not the containers themselves. I think it's a little silly for an Array(T) to have a "removeAny()" method.
I think I am missing a step in the argument here:
Sorry.. I'm not sure if this is the source of confusion, but I was actually quoting the D documentation there. I've modified my message to make the quote more visible. That quote doesn't represent a point I agree with, but rather a point I'm trying to shoot down.
 if the point is that Ranges are the core of the abstraction, why
 does there need to be a class inheritance hierarchy.The lessons
 from C++, Java, Python, and Go, include that the obsession with 
 inheritance during the 1990s went far too far. Inheritance can 
 be useful, but for frameworks it is only conformance to 
 interfaces that matters for consistency across the framework. 
 Thus as long as every data structure type conforms to the 
 correct interface so that there is composability in the use of 
 Ranges, then things are fine.
I am saying that there should _not_ be a class hierarchy. Currently, there is no class hierarchy, but if you read that section I quoted from the container specification, it says that all containers should follow a pattern which allows algorithms to work on any container indiscriminately, which basically has the same effect as if all containers were forced to inherit the same interface. I believe this is over-generalization, and that the abstraction point should be backed-up to ranges. Trying to make a multi-map have the same interface as a vector or array, so that algorithms can be made to work on all containers indiscriminately, is a pipe dream. It's much more reasonable to expect algorithms to be able to work on any range, regardless of which container it has come from.
 This though is philosophy and far too general. To continue this 
 aspect of the debate we would need to deal in more specific 
 things. Are there some examples from Phobos and your containers 
 that we could pin this on?
With the containers I've built, I try to add overloads that take a ranges wherever possible(pushBack, push, insert, find, etc..). The difference between my approach and the one suggested by the current container spec is that the spec would force me to name "pushBack" and "push" the same thing, even though the first one was a member function of a list/array/vector, and the second one was a member of a stack. Containers may turn out to have similar interfaces, but it's not a reason to rigidly define an interface to which all containers must adhere. Bit
Sep 05 2015
prev sibling parent "BBasile" <bb.temp gmx.com> writes:
On Thursday, 3 September 2015 at 19:45:48 UTC, bitwise wrote:
 Any interest in having these in Phobos?

 https://github.com/bitwise-github/d-containers

 Phobos doesn't currently have a Queue(T), and Array(T) leaves 
 much to be desired. The containers I've built are very full 
 featured, fast, and are unittested fairly thoroughly. I intend 
 to add range checking to both containers as well. Inspiration 
 was taken from C++'s vector and queue, C#'s generic List and 
 Queue, and D's Array.

 I'm not sure how the container's I've built would be integrated 
 though. They do go against the current container spec, but for 
 good reason.

 The container spec says containers should be reference types, 
 but I guess this clashed with the idea of Phobos being  nogc so 
 someone tried to make Array(T) ref counted. The result is 
 std.container.Array being a struct with an instance of 
 RefCounted inside it, which is bad for performance, but also 
 inflexible. Innocent looking code like the following will do 2 
 separate allocations: One for the RefCounted payload, and one 
 for the Array's data. On top of being a performance hit, it 
 doesn't allow the user to choose how they want to manage memory.

 Array!int a = Array!int(1, 2, 3);    //  2 allocations, or 
 else!  >:D

 The containers I've built are simple value types with a 
 postblit. Using this as a base, one can simply use the 
 container as-is if they like(which I do), but it's also trivial 
 to make either a ref-counted version, or GC version.

 See here for example:
 https://github.com/bitwise-github/d-containers/blob/master/main.d

     Bit
I think that std.allocators is a prerequisite to implement the some new std containers. Examples: - library array: gc_allocator for a "single shot" program is fine. - library array: aligned_allocator if the array content has to be used with several SSE instructions. - linked list: could use malloctor to automatically manage its payloads life-time, but the final choice will be a parameter of the template instance. Inside the template, some 'static if' branches to adapt the code to the allocator used to make new payloads. Also a the free list to get available payloads instead of allocationg...etc. New containers without std.mallocators would be an error. In this sense, EMSI allocators are a bit more compliant (they already use them, not exactly as required but templates have an optional param to indicate if the structures are GC-free or not).
Sep 04 2015