digitalmars.D - Container hierarchy vs. container types
- Andrei Alexandrescu (95/95) Mar 04 2010 Hi everyone,
- bearophile (7/10) Mar 04 2010 In your post there's lot of stuff to think about. I can see you are tryi...
- Andrei Alexandrescu (10/24) Mar 04 2010 Good point, perfect. I agree with that, and I meant to massage that
- Sandwich (5/33) Mar 04 2010 /*Me ends lurking*/
- Andrei Alexandrescu (3/10) Mar 04 2010 Thanks. Wonder how many other knowledgeable lurkers are out there :o).
- Walter Bright (2/4) Mar 04 2010 Actually, thank you for making a concentrated message.
- Robert Jacques (11/11) Mar 04 2010 On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu
- Andrei Alexandrescu (3/16) Mar 04 2010 It's intentional. You don't want to use a container as a range by mistak...
- Michel Fortin (18/33) Mar 05 2010 I agree that "put" has different semantics than "push" so the name
- Jonathan M Davis (4/7) Mar 05 2010 Probably because it would become way too easy to accidentally use the
- Michel Fortin (13/19) Mar 05 2010 Yes, I understand that.
- Jonathan M Davis (18/39) Mar 05 2010 It would probably be better in such a case to have a range wrap the
- Andrei Alexandrescu (8/28) Mar 05 2010 A range is a view on a container. As such, popping off a range does not
- Michel Fortin (21/28) Mar 05 2010 That's a fine definition, except for the word "topology".
- Andrei Alexandrescu (17/47) Mar 05 2010 It's essential. The way ranges work, they can never change the topology
- Jerry Quinn (25/60) Mar 04 2010 I've rarely found it helpful that an ArrayList is a List, or that HashMa...
- Jonathan M Davis (43/43) Mar 04 2010 I'd definitely go with the unconnected class solution. Sometimes it's us...
- dsimcha (3/27) Mar 04 2010 This is why I hate nominative typing
- Walter Bright (17/22) Mar 04 2010 I've never, in 35 years of programming, ever had a need to change the
- BLS (12/18) Mar 05 2010 I would like to see : collection update events. The C# C5 library is
- Jacob Carlborg (2/20) Mar 05 2010
- Andrei Alexandrescu (4/16) Mar 05 2010 It's a nice touch. I hope at least to allow that kind of customization f...
- Steven Schveighoffer (80/83) Mar 05 2010 As you know, you and I disagree on some aspects of containers. However,...
- Andrei Alexandrescu (29/61) Mar 05 2010 [snip]
- Steven Schveighoffer (26/57) Mar 05 2010 A range is not a container. A range is a small reference-like construct...
- Andrei Alexandrescu (9/78) Mar 05 2010 I agree. Interface ranges are unfortunately a poor practical proposition...
- Steven Schveighoffer (10/16) Mar 05 2010 I disagree. I generally use containers for collecting items for later
- Jonathan M Davis (30/31) Mar 05 2010 I concur with this. It rarely makes sense to outright copy containers, a...
- dolive (3/3) Mar 06 2010 Would like to put a major innovation to the 3.0, Developed quickly 2.0:...
- Denis Koroskin (4/8) Mar 06 2010 Use translate.google.com to express your ideas better!
- dolive (5/20) Mar 06 2010 haha, thanks !
- dolive (5/20) Mar 06 2010 haha, thanks !
Hi everyone, I'm on a trip to Romania visiting relatives. Being disconnected for a while, I thought some about container types and started writing some code. It didn't take long to figure that a classic class hierarchy with inheritance etc. would contain prohibitively many types (with awkward names too), or would lose a lot of expressiveness. There are just so many types of containers, and they have so different characteristics. My second shot was with a capability-based architecture suggested (in an unrelated context) by Scott Meyers (http://www.artima.com/cppsource/codefeatures.html - good article, recommended regardless). Some capabilities I came up with were: enum ContCaps : uint { hasLength = 1, hasAssignableLength = 2, hasPushBack = 4, hasRemoveBack = 8, hasPushFront = 16, hasRemoveFront = 32, linear = 64, hasFront = 128, hasBack = 256, randomAccess = 512, } A container would look like this: interface Container(T, uint caps = 0) ... The trick is that any container could contain more or less any free combination of capabilities. There may be a few combinations that don't quite make sense, but most do. The capability-based design prescribes that a container with N capabilities inherits each and every container with N-1 capabilities obtained by chopping away one capability in turn. For example, a container with Container!(int, 19) would inherit Container!(int, 18), Container!(int, 17), and Container!(int, 3). That was a bit difficult to implement, but I pulled it off with a string mixin. The problem is that the number of effective base interfaces grows exponentially with the number of capabilities. (That's a problem also experienced by Scott.) If I instantiated a Container with all capabilities, the compiler would hang forever. If I had 6-7 capabilities, the code would compile after a while but would generate interfaces that have size 140KB-260KB excluding payload, which is prohibitive. The joys of inheritance hierarchies! The design doesn't even address an important aspect that I'm still mulling over - how do I model whether the container exposes lvalues or is completely encapsulated? For example a trie is unable to expose the lvalues of the arrays it holds because it distributes storage. But wait, there's more. Containers must expose ranges which face similar issues. Here are my incomplete range capabilities: enum RangeCaps { rvalueElem = 1, assignableElem = 2, swappableElem = 4, infinite = 8, noLength = 16, } These capabilities are orthogonal on the fundamental range kinds (input, forward, double-ended, random). So we're looking at not one thick hierarchy, but two! I decided to let no assumption unchallenged and got back to the question: do we really need a container _hierarchy_? How much joy and usefulness do people derive from dynamically changing an array with a list under the same interface? Or a rb-tree vs. a binary tree vs. a left-leaning binary tree? As far as I can tell such changes of containers are a design-time decision, and extremely rarely (never as far as I remember) a runtime decision to hide behind a binary interface. I'm starting to think that it's quite the other way around, and a Sapir-Whorf kind of thing: The fact that Java lacks symbol aliasing capability conditions the thinking that a hierarchy is the right thing to do, whereas the true solution is to design a flat federation of container types with name- and semantics-compatible primitives, to then use them directly or via design-time aliases. Obviously I might be wrong in any number of ways. Switching containers dynamically might be very useful. The precision, flexibility, and beauty I'm aiming for may be useless. Container hierarchies may have some other awesomeness I'm failing to see. But for the time being I'm very seriously considering a simple design: * Each specific container is an unconnected class (or struct, haven't decided yet - probably structs to accommodate reference-counted containers); * Naming matches dictate semantic matches. There should be no false friends - if a container defines removeFront, that should mean the same thing as for other containers (i.e. O(1) removal of the first element). This design follows and enriches STL's design even after we have, as a community, seen many hierarchy-based designs, and even though it looks like hierarchies have "won". The main improvements I want to bring to STL's design are (a) a better categorization of containers, (b) of course replace iterators with ranges, (c) either eliminate allocators or make them work. Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too. Andrei
Mar 04 2010
Andrei Alexandrescu:As far as I can tell such changes of containers are a design-time decision, and extremely rarely (never as far as I remember) a runtime decision to hide behind a binary interface.In your post there's lot of stuff to think about. I can see you are trying to innovate, so here's a comment about data structures of the future. In Gcc 4.5 they have added a profile mode: http://gcc.gnu.org/onlinedocs/libstdc++/manual/profile_mode.html It gives you simple suggestions, based on simple statistics collected at run-time, about data structures choice. Today data structures are mostly chosen at compile time by the programmer, but once the runtime has those statistics collected at run-time, you can think of feeding that data into the program itself at runtime, so data structures can adapt themselves. This can be useful for programs that run on servers, for minutes, hours or days, where they have to face different workloads as time passes. Bye, bearophile
Mar 04 2010
bearophile wrote:Andrei Alexandrescu:Good point, perfect. I agree with that, and I meant to massage that point within the message. A structure that's an e.g. set that adapts its strategy dynamically depending on load is a perfect example. I still don't think that's the job of a hierarchy - it's the encapsulation of an algebraic type together with a strategy for choosing the type currently used. Apologies for the dense message. I wanted to concentrate as much info in as little time/space as possible. AndreiAs far as I can tell such changes of containers are a design-time decision, and extremely rarely (never as far as I remember) a runtime decision to hide behind a binary interface.In your post there's lot of stuff to think about. I can see you are trying to innovate, so here's a comment about data structures of the future. In Gcc 4.5 they have added a profile mode: http://gcc.gnu.org/onlinedocs/libstdc++/manual/profile_mode.html It gives you simple suggestions, based on simple statistics collected at run-time, about data structures choice. Today data structures are mostly chosen at compile time by the programmer, but once the runtime has those statistics collected at run-time, you can think of feeding that data into the program itself at runtime, so data structures can adapt themselves. This can be useful for programs that run on servers, for minutes, hours or days, where they have to face different workloads as time passes. Bye, bearophile
Mar 04 2010
/*Me ends lurking*/ As far as allocators go, you may want to take a look at how eastl handles them: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html#eastl_allocator /*Resumes Lurking*/ Andrei Alexandrescu Wrote:bearophile wrote:Andrei Alexandrescu:Good point, perfect. I agree with that, and I meant to massage that point within the message. A structure that's an e.g. set that adapts its strategy dynamically depending on load is a perfect example. I still don't think that's the job of a hierarchy - it's the encapsulation of an algebraic type together with a strategy for choosing the type currently used. Apologies for the dense message. I wanted to concentrate as much info in as little time/space as possible. AndreiAs far as I can tell such changes of containers are a design-time decision, and extremely rarely (never as far as I remember) a runtime decision to hide behind a binary interface.In your post there's lot of stuff to think about. I can see you are trying to innovate, so here's a comment about data structures of the future. In Gcc 4.5 they have added a profile mode: http://gcc.gnu.org/onlinedocs/libstdc++/manual/profile_mode.html It gives you simple suggestions, based on simple statistics collected at run-time, about data structures choice. Today data structures are mostly chosen at compile time by the programmer, but once the runtime has those statistics collected at run-time, you can think of feeding that data into the program itself at runtime, so data structures can adapt themselves. This can be useful for programs that run on servers, for minutes, hours or days, where they have to face different workloads as time passes. Bye, bearophile
Mar 04 2010
Sandwich wrote:/*Me ends lurking*/ As far as allocators go, you may want to take a look at how eastl handles them: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html#eastl_allocator /*Resumes Lurking*/Thanks. Wonder how many other knowledgeable lurkers are out there :o). Andrei
Mar 04 2010
Andrei Alexandrescu wrote:Apologies for the dense message. I wanted to concentrate as much info in as little time/space as possible.Actually, thank you for making a concentrated message.
Mar 04 2010
On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote: [snip] I like this idea. I've gotten used to using template constraints and ranges when coding which matches this strategy naturally. A lot of the capabilities seem to match ranges, but why did you use different names for some of them? removeBack : popBack removeFront: popFront pushBack : put pushFront : putFront?
Mar 04 2010
Robert Jacques wrote:On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote: [snip] I like this idea. I've gotten used to using template constraints and ranges when coding which matches this strategy naturally. A lot of the capabilities seem to match ranges, but why did you use different names for some of them? removeBack : popBack removeFront: popFront pushBack : put pushFront : putFront?It's intentional. You don't want to use a container as a range by mistake. Andrei
Mar 04 2010
On 2010-03-05 01:53:32 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Robert Jacques wrote:I agree that "put" has different semantics than "push" so the name should be different, but what's the difference in semantics between "remove" and "pop"? Both remove an element from the subject (range or container). I think they should share the same name. You say you don't want to mix ranges with containers. But a container somewhat has the same semantics as an input stream: if you remove/pop something form it, it's no longer there for anyone. You decided to implement input streams as ranges, but at the same time you avoid reusing the same interface for containers; isn't that a little contradictory? Why couldn't a container have the same interface as a range, only with the added ability to push elements into it? -- Michel Fortin michel.fortin michelf.com http://michelf.com/On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote: [snip] I like this idea. I've gotten used to using template constraints and ranges when coding which matches this strategy naturally. A lot of the capabilities seem to match ranges, but why did you use different names for some of them? removeBack : popBack removeFront: popFront pushBack : put pushFront : putFront?It's intentional. You don't want to use a container as a range by mistake.
Mar 05 2010
Why couldn't a container have the same interface as a range, only with the added ability to push elements into it?Probably because it would become way too easy to accidentally use the container as a range, which in many cases would end up emptying it when passing it to one of the std algorithms. - Jonathan M Davis
Mar 05 2010
On 2010-03-05 11:32:27 -0500, Jonathan M Davis <jmdavisProg gmail.com> said:Yes, I understand that. My point is that the problem already exists with ranges: by exposing a stream as a range, you'll empty the stream when iterating over it. If that's acceptable for streams, why is it not for containers? And perhaps you actually want to empty the container as you iterate over it. If you use your container as a queue, that's what you'll expect. In fact, a stream is just like a queue, and you want algorithms to work on streams don't you? -- Michel Fortin michel.fortin michelf.com http://michelf.com/Why couldn't a container have the same interface as a range, only with the added ability to push elements into it?Probably because it would become way too easy to accidentally use the container as a range, which in many cases would end up emptying it when passing it to one of the std algorithms.
Mar 05 2010
Michel Fortin wrote:On 2010-03-05 11:32:27 -0500, Jonathan M Davis <jmdavisProg gmail.com> said:It would probably be better in such a case to have a range wrap the container. That way, you can have range semantics for the container if you want them, but you don't risk using them on a container where you don't. As for streams, I don't know. I do like the idea of using ranges with streams, but I haven't really looked at what the downsides to that might be. It depends a lot on what the std algorithms do. With some it might be fine while with others it would be a really bad idea. I'd have to look into it more to say, but at least with a stream, it's expected that iterating over it will alter it. With a container, that's not usually the case. And in cases where it is, either that container could be a special case and be a range as well, or it you could have a wrapper range do it. Personally, I'd prefer the wrapper idea. It would probably allow you to create generic wrappers to use on pretty much any container (or at least containers with a particular set of functions) rather than special casing particular containers. Regardless, I think that making containers ranges in the general case is a bad idea. - Jonathan M DavisYes, I understand that. My point is that the problem already exists with ranges: by exposing a stream as a range, you'll empty the stream when iterating over it. If that's acceptable for streams, why is it not for containers? And perhaps you actually want to empty the container as you iterate over it. If you use your container as a queue, that's what you'll expect. In fact, a stream is just like a queue, and you want algorithms to work on streams don't you?Why couldn't a container have the same interface as a range, only with the added ability to push elements into it?Probably because it would become way too easy to accidentally use the container as a range, which in many cases would end up emptying it when passing it to one of the std algorithms.
Mar 05 2010
Michel Fortin wrote:On 2010-03-05 11:32:27 -0500, Jonathan M Davis <jmdavisProg gmail.com> said:A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself. Fair enough? AndreiYes, I understand that. My point is that the problem already exists with ranges: by exposing a stream as a range, you'll empty the stream when iterating over it. If that's acceptable for streams, why is it not for containers? And perhaps you actually want to empty the container as you iterate over it. If you use your container as a queue, that's what you'll expect. In fact, a stream is just like a queue, and you want algorithms to work on streams don't you?Why couldn't a container have the same interface as a range, only with the added ability to push elements into it?Probably because it would become way too easy to accidentally use the container as a range, which in many cases would end up emptying it when passing it to one of the std algorithms.
Mar 05 2010
On 2010-03-05 16:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself. Fair enough?That's a fine definition, except for the word "topology". Whether it changes the topology should be an implementation detail. Popping of a C++ vector or a hash table doesn't change the topology, will you make them ranges? That doesn't make much sense. Also, do you consider that popping off a stream affects its topology? If your stream is buffered, you're probably advancing a pointer through a buffer with 'pop', and deallocating old buffers from time to time. So this definition would forbid a stream from being a range. That doesn't make much sense. And similarly you could have a queue container where you insert things at the back and pop things from the front. But that's basically the same thing as a stream. Is a queue a container or a range? I think this definition is more on the spot: a range is something you consume, a container is something used to hold data. Those definition are not mutually exclusive, but is this really a problem? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Mar 05 2010
Michel Fortin wrote:On 2010-03-05 16:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:It's essential. The way ranges work, they can never change the topology of the collection they operate on. That is a fundamental aspect. A container operation may or may not affect its topology.A range is a view on a container. As such, popping off a range does not affect the topology of the underlying container. Consider, for contrast, a balanced binary tree. Removing an element triggers a rebalance of the tree, but popping an element off an iterator in that tree does not affect the tree itself. Fair enough?That's a fine definition, except for the word "topology". Whether it changes the topology should be an implementation detail. Popping of a C++ vector or a hash table doesn't change the topology, will you make them ranges? That doesn't make much sense.Also, do you consider that popping off a stream affects its topology?A stream has no topology I can think of. Within this context, I define topology loosely as the way the data of a container is stored and interconnected in memory.If your stream is buffered, you're probably advancing a pointer through a buffer with 'pop', and deallocating old buffers from time to time. So this definition would forbid a stream from being a range. That doesn't make much sense. And similarly you could have a queue container where you insert things at the back and pop things from the front. But that's basically the same thing as a stream. Is a queue a container or a range? I think this definition is more on the spot: a range is something you consume, a container is something used to hold data. Those definition are not mutually exclusive, but is this really a problem?I think it is. For starters, copying a forward range copies its state, whereas copying a container and then removing stuff from it results in two containers with less net state. By and large, clearly there are important distinctions between ranges and containers. My perception is that you do have that understanding, but are doing a Plato on me. I'm a bit tired of having a Plato done on me, particularly in this case when I don't have the answers. Would be great to strengthen the weak definitions we have now instead of challenging them. Andrei
Mar 05 2010
Andrei Alexandrescu Wrote:I decided to let no assumption unchallenged and got back to the question: do we really need a container _hierarchy_? How much joy and usefulness do people derive from dynamically changing an array with a list under the same interface? Or a rb-tree vs. a binary tree vs. a left-leaning binary tree?As far as I can tell such changes of containers are a design-time decision, and extremely rarely (never as far as I remember) a runtime decision to hide behind a binary interface. I'm starting to think that it's quite the other way around, and a Sapir-Whorf kind of thing: The fact that Java lacks symbol aliasing capability conditions the thinking that a hierarchy is the right thing to do, whereas the true solution is to design a flat federation of container types with name- and semantics-compatible primitives, to then use them directly or via design-time aliases.I've rarely found it helpful that an ArrayList is a List, or that HashMap is a Map. If an API I'm passing to takes the base class, it could easily have been done as a template. And dynamic switching hasn't ever been useful to me. As long as the API is the same, recompiling will work and the containers will continue to behave well. If dynamic switching is really necessary for an application, it can be done by wrapping the container with a class hierarchy as needed.Obviously I might be wrong in any number of ways. Switching containers dynamically might be very useful. The precision, flexibility, and beauty I'm aiming for may be useless. Container hierarchies may have some other awesomeness I'm failing to see. But for the time being I'm very seriously considering a simple design: * Each specific container is an unconnected class (or struct, haven't decided yet - probably structs to accommodate reference-counted containers);Also, structs would reduce the indirections required to access the data, when placing them inside other aggregates as well as reduce memory allocation traffic. I'd rather not pay for 2 allocations to create an instance of: class C { vector!(int) v; }* Naming matches dictate semantic matches. There should be no false friends - if a container defines removeFront, that should mean the same thing as for other containers (i.e. O(1) removal of the first element).This I would hesitate a bit on. I recently changed STL maps for custom hashmaps and not changing operator[] for different code is useful. I'm changing the computational complexity of an operation but conceptually it's the same kind of access. I'm doing so because I know that for my application, the tradeoffs elsewhere are acceptable.This design follows and enriches STL's design even after we have, as a community, seen many hierarchy-based designs, and even though it looks like hierarchies have "won". The main improvements I want to bring to STL's design are (a) a better categorization of containers, (b) of course replace iterators with ranges, (c) either eliminate allocators or make them work. Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.The concept of allocators is nice in principal, though I've never had to put them into action. A couple of places I could see having used STL allocators: wrapping JNI memory or GC memory in an STL container. Neither is particularly necessary in D. Another place I could envision an allocator is if I load a large static data structure with mmap and want to wrap an STL structure around it. I've wanted to do this in particular with strings. This is where C++ allocators fall down at least. Trying to mix custom allocated data structures with standard ones is going to be painful. Jerry
Mar 04 2010
I'd definitely go with the unconnected class solution. Sometimes it's useful to be able to have a class hierarchy where you don't have to care whether about things like whether a Map is a HashMap or a SortedMap, but it frequently comes back to bite you - especially when they require that different operations be implemented by the elements they hold. It can easily lead to an inefficient use of containers - especially after changing one type for another. Additionally, from a performance perspective, by avoiding the class hierarchy, it's possible for them to be final and allow inlining and such. So, there are performance gains by avoiding the class hiercharchy. Also, I recall Scott Meyers talking in Effective STL how containers aren't truly interchangeable and how you can get yourself into trouble when you try and swap one out for another. You really should be aware of what container type your using and what it's strengths and weaknesses are. What it generally comes down to is that you want containers to have the same functions when they do the same thing, so that it's possible to swap one for another when you want to. The same API is generally enough for that. The fact that D has auto makes it that much easier. Being able to swap out vector and list or ArrayList and LinkedList can be useful in some circumstances, but having a List which could be either (or yet another unknown type) is not all that useful, generally speaking. And, as I said before, it makes it really easy to use such containers in inefficient ways since the performance of each operation of each operation can differ significantly depending on the type. It also leads to using only the functions which are the lowest common denominator, which means that the containers are not used to their full potential and are probably being used less efficiently. If you have a set of unconnected container classes and someone wants a hierarchy of containers for their project, it would be quite simple to set up a set of wrapper classes for the containers that they want to be in a hierarchy. On the other hand, you can't take a class hierarchy and use a wrapper class to turn it into a flat set of unconnected containers. I say that it makes sense for functions with the same behavior to have the same names, so that you can effectively duck type the containers, but I don't see much benefit in putting them in a hierarchy where you supposedly don't have to care about what the actual container is that you're using as long as it implements a particular interface. In any case, I'm glad to see that you're finally working on the container problem. I think that it's easily the thing that phobos lacks most, and it would definitely be easier to get people to use D, if there were containers. Most programmers I know would probably not touch D as long as there are no standard containers. It'll be good once we have them. And hopefully, they'll be awesome. - Jonathan M Davis
Mar 04 2010
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleHi everyone, I'm on a trip to Romania visiting relatives. Being disconnected for a while, I thought some about container types and started writing some code. It didn't take long to figure that a classic class hierarchy with inheritance etc. would contain prohibitively many types (with awkward names too), or would lose a lot of expressiveness. There are just so many types of containers, and they have so different characteristics. My second shot was with a capability-based architecture suggested (in an unrelated context) by Scott Meyers (http://www.artima.com/cppsource/codefeatures.html - good article, recommended regardless). Some capabilities I came up with were: enum ContCaps : uint { hasLength = 1, hasAssignableLength = 2, hasPushBack = 4, hasRemoveBack = 8, hasPushFront = 16, hasRemoveFront = 32, linear = 64, hasFront = 128, hasBack = 256, randomAccess = 512, }This is why I hate nominative typing (http://en.wikipedia.org/wiki/Nominative_type_system).
Mar 04 2010
Andrei Alexandrescu wrote:I decided to let no assumption unchallenged and got back to the question: do we really need a container _hierarchy_? How much joy and usefulness do people derive from dynamically changing an array with a list under the same interface? Or a rb-tree vs. a binary tree vs. a left-leaning binary tree?I've never, in 35 years of programming, ever had a need to change the container type at runtime. I don't think the thought has ever even crossed my mind. I appreciate bearophile's point of runtime profiling being used to change the container, but I suggest that such containers be internally self-modifying, rather than externally using user-visible virtual functions. I more and more like the idea of having a set of names defined by convention that have conventional behaviors, in other words, duck typing. It's infinitely extendable and flexible without needing to modify the core library in any way. It also fits right in with D's template constraints. The bit flag approach is not pretty. Furthermore, to add a capability means one has to redo that enum in the core library, which is not scalable and people don't like making their own custom core libraries. Even if they did, it makes for mutually incompatible third party libraries. Yuk.
Mar 04 2010
On 05/03/2010 00:22, Andrei Alexandrescu wrote:I want to bring to STL's design are (a) a better categorization of containers, (b) of course replace iterators with ranges, (c) either eliminate allocators or make them work.(afaik) the only container library which implements this very useful feature. (example a database informs connected users about relevant changes to force update the GUI, no need to pull just in case every 5 seconds) Beside, Implementing update events could be a good reason to give std.pattern a new try... http://www.itu.dk/research/c5/Please speak up - the future of D depends, in small part, on this too. AndreiI think it is a big win for the D2 community to have a container library. I.E. porting C++ will be more comfortable. Bjoern
Mar 05 2010
On 3/5/10 10:44, BLS wrote:On 05/03/2010 00:22, Andrei Alexandrescu wrote:That looks really nice to have.I want to bring to STL's design are (a) a better categorization of containers, (b) of course replace iterators with ranges, (c) either eliminate allocators or make them work.(afaik) the only container library which implements this very useful feature. (example a database informs connected users about relevant changes to force update the GUI, no need to pull just in case every 5 seconds)Beside, Implementing update events could be a good reason to give std.pattern a new try... http://www.itu.dk/research/c5/Please speak up - the future of D depends, in small part, on this too. AndreiI think it is a big win for the D2 community to have a container library. I.E. porting C++ will be more comfortable. Bjoern
Mar 05 2010
== Quote from Jacob Carlborg (doob me.com)'s articleOn 3/5/10 10:44, BLS wrote:It's a nice touch. I hope at least to allow that kind of customization from the outside. AndreiOn 05/03/2010 00:22, Andrei Alexandrescu wrote:That looks really nice to have.I want to bring to STL's design are (a) a better categorization of containers, (b) of course replace iterators with ranges, (c) either eliminate allocators or make them work.(afaik) the only container library which implements this very useful feature. (example a database informs connected users about relevant changes to force update the GUI, no need to pull just in case every 5 seconds)
Mar 05 2010
On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging. Having built a container library, I can tell you the one and only reason I made it have an interface hierarchy -- Tango. Tango had a container library, which was based on ancient Doug Lea collections, with some added features. Because I intended dcollections to replace Tango's collections, I tried to encompass the same feature set that it had. One of those features was the ability to use interfaces without knowing the implementation. Since then, Tango has replaced their container collection with something different, and guess what -- no more interface hierarchy. They have a single interface ICollection, and all containers implement that interface. No Map interface, or List interface or anything. However, although I think generic containers can avoid *requiring* an interface, if you design your classes well, slapping on an interface costs almost nothing. It depends on whether you wish to use classes or structs to implement containers. I still think classes are better, because modifying one aspect of a class is easy to do. If someone wants to make a new type of HashMap that does everything my HashMap does, except changes one little bit, it's really easy. With the advent of alias this, it's also possible to do something similar with structs, but not as straightforward, and not without recompiling. Also, passing containers around by value by default is one of the aspects of STL that I think sucks. When working with STL, I almost never passed around a container by value, I always used a reference, because passing by value can incur large hidden-allocation costs. I'll go over a quick set of points that are pro-interface. First, using an interface hides the implementation. It may not be possible to have your code on display for the compiler to use. Using an interface is a perfectly acceptable way to hide proprietary code that you cannot legally divulge. This is probably the weakest of the points, but I put it out there. Second, D is a statically compiled language, but with the (hopefully soon) evolution to dynamic linking, using an interface is ideal. If you for instance wish to pass a map to or from a plugin library, using an interface is probably the best way to do it. Interfaces are less susceptible to implementation changes/differences. Third, code that uses an interface is compiled once per interface. Code that uses duck typing is compiled once per set of arguments. While this might not seem like much, it can reduce the footprint of generated code. Using duck typing, you may have two almost identical generated functions that differ only by the function addresses used. Finally, interfaces simplify understanding. Once you have used an interface, you know "oh yeah, this is a map, so I can use it like a map." You can strive to build a container library that follows those principles, even making assertions that force the compiler to prove those principles, but it's not as easy for a person to understand as it is to look at interface documentation and know what it does. This becomes important when using libraries that use special implementations of containers. Like for instance a database result or an XML tree. Interfaces in other languages can be viewed as advantageous in other ways, but D has advanced compile-time interfaces so far that those don't really matter in D. For example, declaring that a function requires a map container can be done with duck typing via conditional compilation. At the same time, just like I think ranges don't fit every model (*cough* I/O), interfaces aren't the answer to every aspect of containers. I don't think ranges fit well with interfaces, because iterating interface ranges prevents inlining -- the major draw of ranges in the first place -- and ranges are so much more useful with value semantics. I also think functions that can be tuned to each implementation should be. For this reason, dcollections containers provide a lot of functionality that is not included in the interfaces, simply because the functions are so specific to the implementation, it would be the only class that implemented that interface. For example, all the functions that return cursors (and soon ranges) are not interface functions. This doesn't make them useless via interfaces, but you cannot use every aspect of the container via an interface. An interface is like a common denominator, and I think it should be useful for some purposes. If nobody will ever use the interface as a parameter to a function, then there is no point in declaring the interface (I realize that I have created such interfaces in dcollections and I plan to correct that -- one nice benefit of the contemplation triggered by this discussion). I am working on updating dcollections as we post, and I think I have come up with a nifty way to do ranges that will both retain the usefulness of the cursor, and provide a common way to plug the collections into std.algorithm. Good luck with your containers, I still hold out hope that dcollections can be integrated in Phobos, but I probably need to get it working in order to have any chance of competing :) -Steve
Mar 05 2010
Steven Schveighoffer wrote:On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:[snip] To summarize my understanding of your points: (a) When you only want to tweak a container's behavior in small ways, inheritance is great. (b) Pass by reference is more natural for containers than pass by value. (c) Proprietary code hiding is better with interfaces. (d) Dynamically-linked code works best with interfaces. (e) Interface-based code requires less recompilation and may save on generated code. (f) Interfaces simplify understanding. At the same time, you argue that ranges aren't fit as interfaces because: (a) Inlinining is difficult/impossible (b) Value semantics is more natural for ranges (c) I didn't understand this because you changed the subject from ranges back to containers:Needless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging.I also think functions that can be tuned to each implementation should be. For this reason, dcollections containers provide a lot of functionality that is not included in the interfaces, simply because the functions are so specific to the implementation, it would be the only class that implemented that interface. For example, all the functions that return cursors (and soon ranges) are not interface functions. This doesn't make them useless via interfaces, but you cannot use every aspect of the container via an interface. An interface is like a common denominator, and I think it should be useful for some purposes. If nobody will ever use the interface as a parameter to a function, then there is no point in declaring the interface (I realize that I have created such interfaces in dcollections and I plan to correct that -- one nice benefit of the contemplation triggered by this discussion).One other thing I'm worried about, and was an absolute pain in my preliminary tests of container implementations, is this issue of "iterators/ranges/cursors are not part of the container interface". It is absolutely fundamental that you can get an abstract range from an abstract container because most of what you can do to a container you must do with ranges. Otherwise the duplication of functionality is horrid. This failure alone is a reason to abandon the container hierarchy design.I am working on updating dcollections as we post, and I think I have come up with a nifty way to do ranges that will both retain the usefulness of the cursor, and provide a common way to plug the collections into std.algorithm. Good luck with your containers, I still hold out hope that dcollections can be integrated in Phobos, but I probably need to get it working in order to have any chance of competing :)Good luck to you too. As I specified in my private message - if you undo the fundamental mistake of making find()/contains() part of the container (and of course all other disastrous mistakes of that kind), then I'm sure dcollections will be a very solid library. Andrei
Mar 05 2010
On Fri, 05 Mar 2010 11:53:47 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Steven Schveighoffer wrote:A range is not a container. A range is a small reference-like construct that points to a containers elements. I stand by what I said, a container is more natural as a reference, ranges are more natural with value semantics. If I pass an abstract range into something like std.algorithm.sort, the performance is going to be much worse than using a value-type range. The implementation has the option of doing the latter, so why not make it an interface function? In addition, std.algorithm.sort is going to expect value-type ranges, so you have to make copies everywhere, incurring more performance hits.On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:[snip] To summarize my understanding of your points: (a) When you only want to tweak a container's behavior in small ways, inheritance is great. (b) Pass by reference is more natural for containers than pass by value. (c) Proprietary code hiding is better with interfaces. (d) Dynamically-linked code works best with interfaces. (e) Interface-based code requires less recompilation and may save on generated code. (f) Interfaces simplify understanding. At the same time, you argue that ranges aren't fit as interfaces because: (a) Inlinining is difficult/impossible (b) Value semantics is more natural for rangesNeedless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging.One other thing I'm worried about, and was an absolute pain in my preliminary tests of container implementations, is this issue of "iterators/ranges/cursors are not part of the container interface". It is absolutely fundamental that you can get an abstract range from an abstract container because most of what you can do to a container you must do with ranges. Otherwise the duplication of functionality is horrid. This failure alone is a reason to abandon the container hierarchy design.It depends on what you want to do with the container. std.algorithm is not the only reason to have containers. Duplication of functionality is unavoidable if you do not have abstract ranges. However, by "duplication" it most likely means forwarding calls to range-accepting functions. For example, a List interface may have a sort function. The implementation can forward to std.algorithm.sort if the list is an array, or to an internal merge sort if it is a linked list (in fact, this is similar to what I do now in dcollections). I don't really know how you would do this without knowing one is a linked list and one isn't. It's possible to take a flat non-interfaced collection library and build an interface on top of it, I don't see why it detracts from the original non-interface API. -Steve
Mar 05 2010
Steven Schveighoffer wrote:On Fri, 05 Mar 2010 11:53:47 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:I wasn't debating the point, just organizing ideas.Steven Schveighoffer wrote:A range is not a container. A range is a small reference-like construct that points to a containers elements. I stand by what I said, a container is more natural as a reference, ranges are more natural with value semantics.On Thu, 04 Mar 2010 18:22:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:[snip] To summarize my understanding of your points: (a) When you only want to tweak a container's behavior in small ways, inheritance is great. (b) Pass by reference is more natural for containers than pass by value. (c) Proprietary code hiding is better with interfaces. (d) Dynamically-linked code works best with interfaces. (e) Interface-based code requires less recompilation and may save on generated code. (f) Interfaces simplify understanding. At the same time, you argue that ranges aren't fit as interfaces because: (a) Inlinining is difficult/impossible (b) Value semantics is more natural for rangesNeedless to say, I'd be very curious to hear other opinions in this matter. Please speak up - the future of D depends, in small part, on this too.As you know, you and I disagree on some aspects of containers. However, I think our philosophies are converging.If I pass an abstract range into something like std.algorithm.sort, the performance is going to be much worse than using a value-type range. The implementation has the option of doing the latter, so why not make it an interface function? In addition, std.algorithm.sort is going to expect value-type ranges, so you have to make copies everywhere, incurring more performance hits.I agree. Interface ranges are unfortunately a poor practical proposition.It's not the only reason, but it is the primordial reason - just like "transportation" is a reason for buying a car. It would be a failure to define containers that can't work with std.algorithm.One other thing I'm worried about, and was an absolute pain in my preliminary tests of container implementations, is this issue of "iterators/ranges/cursors are not part of the container interface". It is absolutely fundamental that you can get an abstract range from an abstract container because most of what you can do to a container you must do with ranges. Otherwise the duplication of functionality is horrid. This failure alone is a reason to abandon the container hierarchy design.It depends on what you want to do with the container. std.algorithm is not the only reason to have containers.Duplication of functionality is unavoidable if you do not have abstract ranges. However, by "duplication" it most likely means forwarding calls to range-accepting functions. For example, a List interface may have a sort function. The implementation can forward to std.algorithm.sort if the list is an array, or to an internal merge sort if it is a linked list (in fact, this is similar to what I do now in dcollections). I don't really know how you would do this without knowing one is a linked list and one isn't.The sort example is good, but I wonder how many of those are out there.It's possible to take a flat non-interfaced collection library and build an interface on top of it, I don't see why it detracts from the original non-interface API.Perfect. So I take it we can focus on the federation of containers approach? Andrei
Mar 05 2010
On Fri, 05 Mar 2010 16:16:15 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Steven Schveighoffer wrote:I disagree. I generally use containers for collecting items for later lookup. That's generally a container-specific function that can be part of an interface. Other functions I usually use are iteration, which again can be part of the interface. Besides, making containers have interfaces does not preclude them from using std.algorithm. I agree std.algorithm is definitely a reason to have containers, but that's not mutually exclusive with interfaces. -SteveIt depends on what you want to do with the container. std.algorithm is not the only reason to have containers.It's not the only reason, but it is the primordial reason - just like "transportation" is a reason for buying a car. It would be a failure to define containers that can't work with std.algorithm.
Mar 05 2010
Andrei Alexandrescu wrote:(b) Pass by reference is more natural for containers than pass by value.I concur with this. It rarely makes sense to outright copy containers, and when you do, you're explicitly looking to do so. If a container were a struct, then returning it from a function would be expensive. And it could get really messy trying to have one container as an element in another - particularly if you can't get a reference to the element in the container. You'd have to copy it out of the container, modify it, and copy it back in again. That's way too much unnecessary copying. And it would be easy to think that after you'd changed the element after getting it out of the container that it had changed the one in the container when you hadn't. Container!(Container!(E)) cont; ... auto element = cont.get(x); element.add(someValue); //With a class, the element in the container would now be altered. vs Container!(Container!(E)) cont; ... auto element = cont.get(x); //this involves a copy element.add(someValue); cont.set(x, element); //this involves a copy Now, by getting at the elements of a container by reference, that reduces the problem, but generally speaking, it makes no sense to be copying containers, and it's expensive to do so. I'd vote to just make them final classes. You gain the efficiency of inlining and have reference semantics which is really what typically makes sense for containers. If you want it copied, you can clone it or construct a new one with the elements of the first one. I really don't see what would be gained by making a container a struct. - Jonathan M Davis
Mar 05 2010
Would like to put a major innovation to the 3.0, Developed quickly 2.0: Build the common standard library, Repair a large number of bugs. thank you every much ! dolive89
Mar 06 2010
On Sat, 06 Mar 2010 21:41:48 +0300, dolive <dolive89 sina.com> wrote:Would like to put a major innovation to the 3.0, Developed quickly 2.0: Build the common standard library, Repair a large number of bugs. thank you every much ! dolive89Use translate.google.com to express your ideas better! http://translate.google.com/#en|zh-CN|Use%20translate.google.com%20to%20express%20your%20ideas%20better! http://translate.google.com/#zh-CN|en|%E4%BD%BF%E7%94%A8translate.google.com%E8%A1%A8%E9%81%94%E6%82%A8%E7%9A%84%E6%83%B3%E6%B3%95%E6%9B%B4%E5%A5%BD%EF%BC%81
Mar 06 2010
Denis Koroskin дµ½:On Sat, 06 Mar 2010 21:41:48 +0300, dolive <dolive89 sina.com> wrote:haha, thanks ! Want to major innovations into 3.0 to achieve, and quickly develop 2.0: Development of common base library, a large number of bugs fixed. Can read do? : ) Looking forward to ddmd !Would like to put a major innovation to the 3.0, Developed quickly 2.0: Build the common standard library, Repair a large number of bugs. thank you every much ! dolive89Use translate.google.com to express your ideas better! http://translate.google.com/#en|zh-CN|Use%20translate.google.com%20to%20express%20your%20ideas%20better! http://translate.google.com/#zh-CN|en|%E4%BD%BF%E7%94%A8translate.google.com%E8%A1%A8%E9%81%94%E6%82%A8%E7%9A%84%E6%83%B3%E6%B3%95%E6%9B%B4%E5%A5%BD%EF%BC%81
Mar 06 2010
Denis Koroskin дµ½:On Sat, 06 Mar 2010 21:41:48 +0300, dolive <dolive89 sina.com> wrote:haha, thanks ! Want to major innovations into 3.0 to achieve, and quickly develop 2.0: Development of common base library, a large number of bugs fixed. Can read do? : ) Looking forward to ddmd !Would like to put a major innovation to the 3.0, Developed quickly 2.0: Build the common standard library, Repair a large number of bugs. thank you every much ! dolive89Use translate.google.com to express your ideas better! http://translate.google.com/#en|zh-CN|Use%20translate.google.com%20to%20express%20your%20ideas%20better! http://translate.google.com/#zh-CN|en|%E4%BD%BF%E7%94%A8translate.google.com%E8%A1%A8%E9%81%94%E6%82%A8%E7%9A%84%E6%83%B3%E6%B3%95%E6%9B%B4%E5%A5%BD%EF%BC%81
Mar 06 2010