
digitalmars.D.announce - dcollections 1.0 and 2.0a beta released

"Steven Schveighoffer" <schveiguy yahoo.com> writes:
After much work and toil, I have created dcollections for D2.  I think I  
can say that this is the first collection package available for D2.  I  
still hold hope that it will be used as the standard collection package  
for Phobos in D2.

Please visit http://www.dsource.org/projects/dcollections for details.

Some VERY IMPORTANT notes for this release:

1. It is very beta software.  Although the implementation has changed very  
little from the 1.0 version, not everything will compile properly.  I have  
filed several compiler bugs for problems I hit while porting, and  
inserted workarounds for things I could work around.  Please use the  
LATEST compiler (currently 2.046), because some very important features  
have been added that dcollections relies on.  Although it is beta, the  
algorithms and implementation are largely unchanged, so functionality should  
be pretty solid.

2. The docs are not online for D2.  Unfortunately, dsource's  
auto-doc generator that I was using for the 1.0 version is based on a 1.0  
compiler, so it will not automatically generate the D2 docs.  However, I  
have included some basic ddoc-generated docs in the download package;  
don't expect them to be very fun to use :)

3. The docs are not fully updated, so they might be just flat out wrong!   
I plan to update the docs completely for the next beta release.

So, for the good stuff... here are some notes for differences between 1.0  
and 2.0:

     * Ranges!
     * Filled out slicing for all containers
     * Cursors are now non-movable.  Use ranges for safe iteration.
     * The interfaces have been cut down significantly. The question I  
asked myself when deciding whether I wanted to keep an interface is "would  
anyone use this interface?"
     * Functions that should be fast but can be slow (O(n)) have been  
removed from all interfaces, and in most cases, from the containers.  
Notably missing is searching a non-quick lookup container for a value. Use  
std.algorithm.find for that.
     * ArrayMultiset has been removed -- its complexity guarantees did not  
fit the multiset requirements (specifically, quick lookup of an  
element's presence).
     * ArrayList slicing now simply returns a range instead of a "live"  
slice.  Note that an ArrayList range is aliased to a D array.
     * Full unit tests added for all container types.

I have also changed the license from BSD to Boost.


--------

I have also formally released version 0.03 as version 1.0, nothing has  
changed there, including the license.  However, no new features will be  
added to dcollections version 1.0, only bug fixes will be done.

Please, file tickets on dsource for any problems you have.  And let me  
know if there are any ideas you think would be useful.

-Steve
May 19 2010
"Yao G." <nospamyao gmail.com> writes:
Thanks Steven. I was waiting for this for a long time.



-- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
May 19 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Are the collections supposed to not have isEmpty members?

Looking forward to using dcollections, and here's to hoping they make it 
into phobos.

OTish: What are your thoughts on a bimap implementation and a 
child/sibling or general tree implementation as part of dcollections?
May 19 2010
Steven Schveighoffer <schveiguy yahoo.com> writes:
Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.
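The "length == 0" answer works because every dcollections container keeps its count up to date on each topology change. A toy sketch of that bookkeeping in Python (illustrative only; dcollections is D code and its real names differ):

```python
# Sketch (not dcollections code): a list that maintains its length on
# every insert/remove, so "length == 0" is an O(1) emptiness test and
# no separate isEmpty member is needed.
class CountedList:
    def __init__(self):
        self.head = None
        self.length = 0          # updated on every topology change

    def add(self, value):
        self.head = {"value": value, "next": self.head}
        self.length += 1
        return self              # allow chaining, like add(1).add(2)

    def remove_front(self):
        assert self.head is not None
        node = self.head
        self.head = node["next"]
        self.length -= 1
        return node["value"]

lst = CountedList()
assert lst.length == 0           # empty check in O(1)
lst.add(1).add(2)
assert lst.length == 2
```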
 OTish: What are your thoughts on a bimap implementation and a 
 child/sibling or general tree implementation as part of dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

-Steve
May 19 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

I think boost.bimap is where I saw it, though I don't use C++. I think it's a map where the values->keys direction is also a map.
 That being said, if you have any implementation of a tree or hash, it should
be easy to insert into dcollections.  If you have ideas for other collection
types (i.e. other than Map, Set, Multiset or List), then I can look into that
if you point me at an implementation or have one of your own.  I purposefully
left out multi-map because I've never had a huge use for it, and it seemed like
a awkward thing to create an interface for...

 -Steve

I have a simple child/sibling tree implementation which I could probably dust off and polish up if you want it. The method for visiting the elements is kind of weird, though. And I don't know that it exactly fits the mold of a reference container. Maybe with cursors.

Ugh, I just noticed LinkList doesn't work with interfaces.
May 19 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.

One thing before I forget: I think any good collection abstraction must be concretized back to the classic collection instances. Singly-linked lists definitely can't be left off the list! It would be an epic failure. Imagine the papers! New York Times: "D has containers, but no singly-linked lists". The New Yorker: "D's abstractions are too abstract". The National Enquirer: "The Bat Boy stole D's singly-linked lists". Pyongyang Times: "Another failure of the so-called Western Democracy -- yet Juche doesn't need singly-linked lists anyway."

Keeping the length cached in a singly-linked list is a costly mistake. It works against splicing (an important list primitive), and most of the time you don't need it, so why waste time updating it? Adding any cruft beyond { T payload; List* next; } is very strong incentive for the coder to write their own. Why do you think they're using an SLL in the first place? Because it's simple and has efficient primitives!
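The splicing trade-off can be made concrete. A minimal sketch in Python (illustrative only, not from any library): moving a node between lists is pure pointer surgery, O(1), but any cached per-list length would need updating on every such move, and for a longer run that means counting the moved nodes, O(n):

```python
class Node:
    def __init__(self, v, nxt=None):
        self.v = v
        self.next = nxt

def splice_after(src, dst):
    """Move the single node after `src` to just after `dst`: O(1)
    pointer surgery.  If each list cached its length, both caches would
    have to be adjusted here, and splicing a longer run would mean
    walking it to count the moved nodes -- O(n)."""
    moved = src.next
    src.next = moved.next
    moved.next = dst.next
    dst.next = moved

# two tiny lists: a -> b   and   x -> y
b = Node(2); a = Node(1, b)
y = Node(20); x = Node(10, y)
splice_after(a, x)       # afterwards: a   and   x -> b -> y
assert a.next is None
assert x.next is b and b.next is y
```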
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

Andrei
May 22 2010
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Keeping the length cached in a singly-linked list is a costly mistake.
 It works against splicing (an important list primitive) and most of the
 time you don't need it so why waste time updating it.

...

    LinkedList(T, bool WithLength = true)
    {
        static if (WithLength)
        {
            // defines the length attribute here, etc.
        }
    }

Inside the methods like add there is another static if (WithLength) that enables/disables the code that updates the length. This keeps the length by default, but makes it easy to remove it when desired.

Bye,
bearophile
May 22 2010
Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:
 
 One thing before I forget: I think any good collection abstraction must
 be concretized back to the classic collection instances. Singly-linked
 lists definitely can't be left off the list! It would be an epic 
 failure. Imagine the papers! New York Times: "D has containers, but no 
 singly-linked lists".

We could always say we were following C++'s lead :-)
May 22 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/22/2010 09:07 PM, Sean Kelly wrote:
 Andrei Alexandrescu Wrote:
 One thing before I forget: I think any good collection abstraction must
 be concretized back to the classic collection instances. Singly-linked
 lists definitely can't be left off the list! It would be an epic
 failure. Imagine the papers! New York Times: "D has containers, but no
 singly-linked lists".

We could always say we were following C++'s lead :-)

C++(0|1)x has forward_list. Needless to say, supporting size() at constant complexity was a matter of huge debate. The proposal that I like is this:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2543.htm

Search the page for "size()". I don't know how the proposal was amended.

Andrei
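The trade-off n2543 settles on is that forward_list simply has no size(): nodes stay minimal and splicing stays O(1), and anyone who wants the size walks the list. A sketch of that walk in Python (for illustration; the actual proposal is C++):

```python
# Sketch: a forward_list-style singly-linked list with no cached size.
# Asking for the length means an O(n) walk -- the price accepted in
# exchange for minimal nodes and constant-time splicing.
class Node:
    def __init__(self, v, nxt=None):
        self.v = v
        self.next = nxt

def length(head):
    n = 0
    while head is not None:
        n += 1
        head = head.next
    return n    # O(n)

lst = Node(1, Node(2, Node(3)))
assert length(lst) == 3
assert length(None) == 0    # empty list
```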
May 22 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collection roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.
 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list.  That was a decision I made early on, because it allows more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that. My main point was that if one defines an slist, in many cases one defines it to be very small, simple, and cheap. Maintaining size will come on the radar and would discourage the use of the abstraction in favor of a hand-rolled implementation. This is failure to abstract.
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

I am not familiar with tries, and dcollections has no class hierarchy.

I don't know of a word for "hierarchy with interfaces as root and inner nodes and classes as leaf nodes", so I just call that a class hierarchy.

Andrei
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this:

    auto total = cont1.length + cont2.length;

because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collection roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.

I agree, removing all the interfaces would get rid of the need for NO_LENGTH_SUPPORT. But all dcollections define a valid length, so that point is kinda moot :)

Not moot because you have interfaces.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat? Is add expected to put things in a specific place? How does slist implement back()? slist does not fit the List interface.
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that.

I felt like you did. Your statement was: "It works against splicing (an important list primitive) and most of the time you don't need it so why waste time updating it." While you didn't specifically say it was used more, you mentioned the importance of splicing, and the non-importance of length. I guess I should have said, I disagree that splicing is more important than caching length.
 My main point was that if one defines an slist, in many cases one
 defines it to be very small, simple, and cheap. Maintaining size will
 come on the radar and would discourage the use of the abstraction in
 favor of a hand-rolled implementation. This is failure to abstract.

All dcollections own their elements; it is not like an array, where there can be multiple references to subsets of the data. An slist would simply be an slist as you describe, with a slightly bigger head. In other words, the head node has the length cache and allocator, and object monitor, and whatever else you need for the whole list. The nodes themselves would be a simple element and pointer.

But the elements are never exposed except via ranges and cursors. The ranges and cursors cannot affect the topology of the list; you need the head "owner" node to do that.

I understand. The problem is when you many such lists or when you manipulate a few intensively.
 I look at it this way, dcollections are "good enough" for most uses, if
 you need highly tailored data structures with specific uses in mind, you
 can make those on your own.

 For example, dcollections' LinkList can easily take the place of a
 simple slist. It may not be as fast with your specific requirements in
 mind, but it works. The benefit is, it works like all the other
 collection types and it's easy to learn all of them together because
 they all have certain properties.

That's what STL said about slists. Next thing you know... forward_list is being standardized.

Andrei
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 11:14 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this:

    auto total = cont1.length + cont2.length;

because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check.

I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length; it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part
 of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

I think we need to build the shared vision that in Phobos we're competing against hand-written, highly-optimized code. This is the fight STL took and largely won. We can't afford to lower our standards for one tiny bit.

I was once talking over a beer about the advantages of container abstractions. Walter slapped my face by defining the fastest SLL in the world on a napkin. It looked like this:

    struct List { int v; List * next; }

He took so little time to define that, and the primitives he needed were so simple and fast, that it was an absolute overkill to replace that with a generic whiz-bang container. If I'd mentioned there'd be any extra data involved, that would mean the end of negotiation. "Why do you need that?" "For length()!" "But I don't need length()!" "Well you have to." "That's Communism in design!"

And I agree.

Andrei
May 24 2010
Pelle <pelle.mansson gmail.com> writes:
On 05/24/2010 06:27 PM, Andrei Alexandrescu wrote:
 struct List { int v; List * next; }

While I do agree with that design for a list, that is no reference type. I thought we wanted reference types.
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 02:01 PM, Pelle wrote:
 On 05/24/2010 06:27 PM, Andrei Alexandrescu wrote:
 struct List { int v; List * next; }

While I do agree with that design for a list, that is no reference type.

List* is. My point was that the pressure for a really simple hand-rolled SLL is huge. A good abstraction should convince the Walters of this world that the abstraction is actually better than the hand-rolled SLL.

Andrei
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it possibly  
 could be?  Hell no, but it's good enough for most situations, and safe.

Append in D dynamic arrays is awful: it's slow and requires complex code (and currently there are no explanations of how it works in the D docs, see http://d.puremagic.com/issues/show_bug.cgi?id=4091 ), so it's not a good example to be copied :-) It's not fast enough for most situations; that's why there is still the need for an Appender.

Bye,
bearophile
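The reason an Appender type helps is amortized growth: reserve capacity geometrically so that most appends touch no allocator at all. A toy sketch in Python (illustrative only; this is not bearophile's dlibs Appender or Phobos code, and the names are invented):

```python
class Appender:
    """Toy appender: doubles capacity when full, so each append is
    amortized O(1) even though an occasional append pays for a grow."""
    def __init__(self):
        self.data = [None] * 4   # initial capacity
        self.len = 0

    def append(self, x):
        if self.len == len(self.data):
            # double the capacity: O(n) now, but amortized O(1) overall
            self.data.extend([None] * len(self.data))
        self.data[self.len] = x
        self.len += 1

    def result(self):
        return self.data[:self.len]

a = Appender()
for i in range(10):
    a.append(i)
assert a.result() == list(range(10))
```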
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
And the appending is hard to use too: see the ridiculously complex-to-use &
messy capacity, reserve, and assumeSafeAppend. So overall it's a good
example of what I will never want to copy.

Bye,
bearophile
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 complex code?
 a ~= b;
 Seems pretty not complex.

I meant on the implementation side. Arrays are used all the time, and they are basic building blocks used everywhere, so a programmer probably must understand how they work inside.
 That part is still pretty new.  What do you find is messy and complex  
 about it?

The internal semantics of the current design of dynamic arrays are not transparent enough; this means I often don't know what it is doing. This can be acceptable for a more complex data structure such as the associative arrays. And it's not easy to know exactly where to use assumeSafeAppend. Despite their complexity, D dynamic arrays are not safe enough; see the small thread about passing D dynamic arrays by reference by default.
 When I said "good enough for most situations" I didn't mean "most  
 situations where appending speed is crucial to the success of the  
 application" :P

I have written a few hundred small D programs, and I have seen that appending happens often (more often than I had originally thought) in points where performance is important enough. Using the Appender of my dlibs was usually able to solve the problem. I don't know better solutions, so maybe I'm just wasting your time; I am sorry. Before seeing D I never thought that designing good dynamic arrays could be so hard :-)

Bye,
bearophile
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 02:29 PM, bearophile wrote:
 Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it possibly
 could be?  Hell no, but it's good enough for most situations, and safe.

Append in D dynamic arrays is awful: it's slow and requires complex code (and currently there are no explanations of how it works in the D docs, see http://d.puremagic.com/issues/show_bug.cgi?id=4091 ), so it's not a good example to be copied :-) It's not fast enough for most situations; that's why there is still the need for an Appender.

When was the last time you measured? I thought the speed has largely improved since Steve integrated his work. Andrei
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 When was the last time you measured? I thought the speed has largely 
 improved since Steve integrated his work.

I have timed it after that integration. I have seen a performance improvement, but it's small. I can perform some synthetic benchmarks.

Later,
bearophile
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 01:49 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 A pointer to the end element would be required for both appending and
 back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :)

They are. Ask Walter for a good example.
 The overhead of storage is minimal
 compared to the elements of the list, which do not contain said overhead.

I'm talking containers of lists here.
 I think we need to build the shared vision that in Phobos we're
 competing against hand-written, highly-optimized code. This is the
 fight STL took and largely won. We can't afford to lower our standards
 for one tiny bit.

 I was once talking over a beer about the advantages of container
 abstractions. Walter slapped my face by defining the fastest SLL in
 the world on a napkin. It looked like this:

 struct List { int v; List * next; }

 He took so little time to define that, and the primitives he needed
 were so simple and fast, that it was an absolute overkill to replace
 that with a generic whiz-bang container. If I'd mentioned there'd be
 any extra data involved, that would mean the end of negotiation. "Why
 do you need that?" "For length()!" "But I don't need length()!" "Well
 you have to." "That's Communism in design!"

 And I agree.

But something like the above is prone to all sorts of nasty things, like inadvertent cycles in your list.

Yeah, and I'd have my list of arguments to add too.
 Which may be acceptable to you if you
 want the bleeding fastest speed available. There will always be tailored
 code crafted to be as fast as possible for some specific design and
 dcollections and your slist just won't fit the bill.

Except that the set is considerably smaller for a well-designed slist.
 And dcollections' link list node is exactly what you wrote there (with a
 prev pointer of course). The only difference is, I don't define an
 element is the same as a list. The whole list has it's own data,
 including a pointer to the first element in the list.

 Look at D's arrays. Is appending with D's arrays the fastest it possibly
 could be? Hell no, but it's good enough for most situations, and safe.

If I could make them faster without adversely affecting the performance of other operations, I would. There are considerable constraints at play in the array design, for historical and other reasons. Andrei
May 24 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies.

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement in time those and many more in Phobos without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in thinking that another very, very similar container will come along.

Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:39 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies.

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement in time those and many more in Phobos without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in thinking that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys.

The key point of tries is that they have distributed storage. Thus, shoehorning them into the interface function

Iterator!(K[]) keys();

forces allocation and copying. Andrei
May 24 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Is it unreasonable to expect to be able  
 to iterate the keys in a trie?  (I don't really know, I've never worked  
 with them)

On tries you can iterate on all keys or on a sorted range of the keys. Bye, bearophile
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie?  (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if the key is a string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument: it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x'), which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind). Andrei
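The exploratory range described above can be sketched on a toy trie. The node layout and the names (TrieNode, Explorer) are illustrative only, not dcollections or Phobos API:

```d
// A toy trie over strings, with an exploratory "range": each popFront(c)
// narrows the view to keys extending the current prefix by the element c.
struct TrieNode
{
    TrieNode*[char] children;
    bool terminal;      // a key ends at this node
    size_t count;       // number of keys stored in this subtree
}

void addKey(TrieNode* root, string key)
{
    auto p = root;
    p.count++;
    foreach (char c; key)
    {
        if (auto next = c in p.children)
            p = *next;
        else
        {
            auto n = new TrieNode;
            p.children[c] = n;
            p = n;
        }
        p.count++;
    }
    p.terminal = true;
}

struct Explorer
{
    TrieNode* node;     // null once the range has run dry

    bool empty() const { return node is null; }
    // front() reports on all keys sharing the prefix spanned so far.
    size_t front() const { return node.count; }
    // popFront takes an argument: extend the prefix by one element.
    void popFront(char c)
    {
        if (auto next = c in node.children)
            node = *next;
        else
            node = null;
    }
}

void main()
{
    auto root = new TrieNode;
    foreach (k; ["ax", "ay", "b"])
        addKey(root, k);

    auto e = Explorer(root);
    e.popFront('a');
    assert(!e.empty && e.front == 2);   // "ax" and "ay" share prefix "a"
    e.popFront('x');
    assert(!e.empty && e.front == 1);   // only "ax" remains
    e.popFront('q');
    assert(e.empty);                    // nothing with prefix "axq"
}
```

Note that this Explorer is not foreachable precisely because popFront needs an argument - which is the point being made below about it not fitting the existing interfaces.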
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth comes to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.
 Or is there another way to iterating the keys that would be more
 efficient than building an array to contain the 'key' for each node?

I don't know. I'm just glad I can explore that without having to think in interfaces. Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 03:18 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 15:58:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never
 worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth comes to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).

No there isn't. I wouldn't make that an interface, as it's not foreachable. I'd build a special range for it.
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.

Including keys?!!! Still not a straight answer. If you should be able to iterate keys, then say you should be able to iterate keys. If you shouldn't, then say that.

Sorry. Yes, by-key iteration should be possible. Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy tree traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone? Andrei
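A minimal sketch of reading a key back off a leaf through parent links; the node layout is assumed for illustration (not dcollections'), and the prepend loop is the copy into a K[] that an interface like keys() would force:

```d
// Reconstructing a trie key by walking parent pointers from a leaf.
// Each node stores the edge character that led to it from its parent.
struct Node
{
    Node* parent;
    char edge;      // meaningless for the root
}

// Builds the key bottom-up. Prepending like this is quadratic overall,
// which is fine for a sketch; the point is the unavoidable copy.
string keyOf(const(Node)* leaf)
{
    char[] buf;
    for (auto p = leaf; p.parent !is null; p = p.parent)
        buf = p.edge ~ buf;
    return buf.idup;
}

void main()
{
    auto root = new Node(null, '\0');
    auto a = new Node(root, 'a');
    auto x = new Node(a, 'x');
    assert(keyOf(x) == "ax");
    assert(keyOf(root) == "");
}
```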
May 24 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 04:38 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy tree traces into T[] arrays. If the trie stores parent nodes, you don't need to do that as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know.

This wins a little battle but not the war. Long term you're facing the Sisyphean job of either knocking new containers into the existing interfaces, or extending the interfaces to accommodate new containers. It doesn't look to me like a winning proposition. The FIC (Federation of Independent Containers) does not have that problem. It does have its specifications and guidelines, but whichever container doesn't support some or even all of the required methods can simply define its own under other names. Then users can play with the containers and with the ranges they define as they see fit. Andrei
May 24 2010
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
Sheer awesomeness!  I've been waiting for a long time for a decent collections lib for D2.  I haven't tested it extensively, but it definitely builds w/o a hitch (using a custom build script I wrote that builds dcollections and other libs I use in one go) and works for the toy examples I tested.

One small comment, though:  Looking at your ArrayList code, wouldn't it make more sense to use a capacity field here than to use the builtin array appending mechanism, given that this is a class, not a struct?
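Roughly, the suggestion is that a class can own its storage and track capacity itself, instead of paying for the runtime's per-append block-capacity lookup. A hypothetical sketch (names and layout are illustrative, not the actual ArrayList code):

```d
// A class-based array list that tracks its own capacity, per dsimcha's
// suggestion. buf.length is the capacity; len is the element count.
class ArrayListSketch(T)
{
    private T[] buf;
    private size_t len;

    void add(T v)
    {
        if (len == buf.length)              // grow geometrically
        {
            auto bigger = new T[buf.length ? buf.length * 2 : 4];
            bigger[0 .. len] = buf[0 .. len];
            buf = bigger;
        }
        buf[len++] = v;                     // no runtime capacity query
    }

    size_t length() { return len; }
    ref T opIndex(size_t i) { return buf[i]; }
}

void main()
{
    auto a = new ArrayListSketch!int;
    foreach (i; 0 .. 100)
        a.add(i);
    assert(a.length == 100 && a[99] == 99);
}
```

Since the class uniquely owns buf, in-place extension is safe in a way it isn't for a bare D slice, which can't know whether something else references the tail of its block.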
May 19 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
dsimcha Wrote:

 One small comment, though:  Looking at your ArrayList code, wouldn't it
 make more sense to use a capacity field here than to use the builtin
 array appending mechanism, given that this is a class, not a struct?

Yes, that is part of my semi-documented long term plan. I actually am planning on adding an "extendLength" runtime function to support extending the length of an array as much as possible without reallocating. That way I can make ArrayList independent of the allocation sizes that the GC supports. -Steve
May 19 2010
prev sibling next sibling parent Graham Fawcett <fawcett uwindsor.ca> writes:
Hi Steven,

On Wed, 19 May 2010 12:09:11 -0400, Steven Schveighoffer wrote:

 After much work and toil, I have created dcollections for D2.  I think I
 can say that this is the first collection package available for D2.  I
 still hold hope that it will be used as the standard collection package
 for phobos for D2.

Cool! In case anyone's interested, I tried out dcollections using my very alpha 'd-build' script, and it works just fine:

http://github.com/gmfawcett/d-build/tree/dcollections-test

tarball of the dcollections test branch:

http://github.com/gmfawcett/d-build/tarball/dcollections-test

Checking out the 'dcollections-test' branch of that git project, you should be able to just run './d-build test.d' and it will first download dcollections and then compile it along with the test program. There are several limitations; see the README for details.

There's a long way to go before d-build is more than a toy. But I'd like to keep it on people's radars, and am interested in your thoughts and feedback. See the "envy for go packages" threads on this list for context.

Regards,
Graham
 
 Please visit http://www.dsource.org/projects/dcollections for details.
 
 Some VERY IMPORTANT notes for this release:
 
 1. it is very beta software.  Although the implementation has changed
 very little from the 1.0 version, not everything will properly compile. 
 I have filed several compiler bugs to fix some problems I had while
 porting, and inserted workarounds for things I could work around. 
 Please use the LATEST compiler (currently 2.046) because some very
 important features have been added that dcollections relies on. 
 Although it is beta, the algorithms and implementation is largely
 unchanged so functionality should be pretty solid.
 
 2. the docs are not online for d2.  Unfortunately, dsource's
 auto-doc-generator that I was using for the 1.0 version is based on a
 1.0 compiler, so it will not automatically generate the d2 docs. 
 However, I have included some basic ddoc generated docs in the download
 package, don't expect them to be very fun to use :)
 
 3. The docs are not fully updated, so they might be just flat out wrong!
 I plan to update the docs completely for the next beta release.
 
 So, for the good stuff... here are some notes for differences between
 1.0 and 2.0:
 
      * Ranges!
      * Filled out slicing for all containers
      * Cursors are now non-movable.  Use ranges for safe iteration.
      * The interfaces have been cut down significantly. The question I
 asked myself when deciding whether I wanted to keep an interface is
 "would anyone use this interface?"
      * Functions that should be fast but can be slow (O(n)) have been
 removed from all interfaces, and in most cases, from the containers.
 Notably missing is searching a non-quick lookup container for a value.
 Use std.algorithm.find for that.
      * ArrayMultiset has been removed -- its complexity functions did not
 fit with the multiset requirements (specifically, quick lookup of an
 element's presence).
      * ArrayList slicing now simply returns a range instead of a "live"
 slice.  Note that an ArrayList range is aliased to a D array.
      * Full unit tests added for all container types.
 
 I have also changed the license from BSD to Boost.
 
 
 --------
 
 I have also formally released version 0.03 as version 1.0, nothing has
 changed there, including the license.  However, no new features will be
 added to dcollections version 1.0, only bug fixes will be done.
 
 Please, file tickets on dsource for any problems you have.  And let me
 know if there are any ideas you think would be useful.
 
 -Steve

May 19 2010
prev sibling next sibling parent Graham Fawcett <fawcett uwindsor.ca> writes:
On Wed, 19 May 2010 20:27:17 +0000, Graham Fawcett wrote:

 Hi Steven,
 
 On Wed, 19 May 2010 12:09:11 -0400, Steven Schveighoffer wrote:
 
 After much work and toil, I have created dcollections for D2.  I think
 I can say that this is the first collection package available for D2. 
 I still hold hope that it will be used as the standard collection
 package for phobos for D2.

Cool! In case anyone's interested, I tried out dcollections using my very alpha 'd-build' script, and it works just fine:

http://github.com/gmfawcett/d-build/tree/dcollections-test

tarball of the dcollections test branch:

http://github.com/gmfawcett/d-build/tarball/dcollections-test

Checking out the 'dcollections-test' branch of that git project, you should be able to just run './d-build test.d' and it will first download dcollections and then compile it along with the test program. There are several limitations; see the README for details.

There's a long way to go before d-build is more than a toy. But I'd like to keep it on people's radars, and am interested in your thoughts and feedback. See the "envy for go packages" threads on this list for context.

Apologies; the "envy for go packages" thread is on the D list, not on D.announce. Graham
May 19 2010
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Steven Schveighoffer Wrote:

 After much work and toil, I have created dcollections for D2.  I think I  
 can say that this is the first collection package available for D2.  I  
 still hold hope that it will be used as the standard collection package  
 for phobos for D2.
 
 Please visit http://www.dsource.org/projects/dcollections for details.
 
 Some VERY IMPORTANT notes for this release:
 
 1. it is very beta software.  Although the implementation has changed very  
 little from the 1.0 version, not everything will properly compile.  I have  
 filed several compiler bugs to fix some problems I had while porting, and  
 inserted workarounds for things I could work around.  Please use the  
 LATEST compiler (currently 2.046) because some very important features  
 have been added that dcollections relies on.  Although it is beta, the  
 algorithms and implementation is largely unchanged so functionality should  
 be pretty solid.
 
 2. the docs are not online for d2.  Unfortunately, dsource's  
 auto-doc-generator that I was using for the 1.0 version is based on a 1.0  
 compiler, so it will not automatically generate the d2 docs.  However, I  
 have included some basic ddoc generated docs in the download package,  
 don't expect them to be very fun to use :)
 
 3. The docs are not fully updated, so they might be just flat out wrong!   
 I plan to update the docs completely for the next beta release.
 
 So, for the good stuff... here are some notes for differences between 1.0  
 and 2.0:
 
      * Ranges!
      * Filled out slicing for all containers
      * Cursors are now non-movable.  Use ranges for safe iteration.
      * The interfaces have been cut down significantly. The question I  
 asked myself when deciding whether I wanted to keep an interface is "would  
 anyone use this interface?"
      * Functions that should be fast but can be slow (O(n)) have been  
 removed from all interfaces, and in most cases, from the containers.  
 Notably missing is searching a non-quick lookup container for a value. Use  
 std.algorithm.find for that.
     * ArrayMultiset has been removed -- its complexity functions did not  
 fit with the multiset requirements (specifically, quick lookup of an  
 element's presence).
      * ArrayList slicing now simply returns a range instead of a "live"  
 slice.  Note that an ArrayList range is aliased to a D array.
      * Full unit tests added for all container types.
 
 I have also changed the license from BSD to Boost.
 
 
 --------
 
 I have also formally released version 0.03 as version 1.0, nothing has  
 changed there, including the license.  However, no new features will be  
 added to dcollections version 1.0, only bug fixes will be done.
 
 Please, file tickets on dsource for any problems you have.  And let me  
 know if there are any ideas you think would be useful.
 
 -Steve

Well as long as you're here can I submit an error here? :)

I get an error while building the D2 branch:

Error: ArrayMultiset.obj : No such file or directory

It seems the ArrayMultiset.d is not in the dcollections folder for the D2.x branch, although it is in trunk (I guess the trunk is the D1.x one?). Is that module deprecated for D2.x? (although it's listed in the win32 batch file)
May 19 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrej Mitrovic Wrote:


 Well as long as you're here can I submit an error here? :)
 
 I get an error while building the D2 branch:
 
 Error: ArrayMultiset.obj : No such file or directory

Crud, I admit that I assumed anything that built on Linux would build on Windows. I still believe it will, but of course, I need to change the batch file :)
 
 It seems the ArrayMultiset.d is not in the dcollections folder for the D2.x 
branch, although it is in trunk (I guess the trunk is the D1.x one?). 

Yes, D1 is trunk, D2 is going to eventually be trunk
 
 Is that module deprecated for d2.x? (although it's listed in the win32 batch
file)

Yes, just remove it. I'll fix it when I get a chance. -Steve
May 19 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 11:09 AM, Steven Schveighoffer wrote:
 After much work and toil, I have created dcollections for D2.

Awesome! I'm looking through it, unfortunately after wandering in the trunk for a while (I was like, wait, what?). But that was after all good because I saw a lot of awesome improvements in the new library. This is solid work. I salute your decision to frame complexity as a feature and remove functions with "undecided complexity". This is huge.

That being said, I think dcollections2 has not yet solved a number of problems, some minor, some annoying, and some fundamental. This makes things quite unpleasant because, willingly or not, I'm in a three-way conflict of interest: (1) I can influence to some extent the decision of adopting dcollections2 for phobos; (2) I have my own design in mind which is competing with dcollections2; (3) my time is scarce so I'm having difficulty executing on that design, whereas dcollections2 is already usable.

I'm not sure where this leaves us. I'm tempted to post a list of "grievances" with the current design, but it's difficult to make that sound sincere due to the conflict of interest. Let me start by asking this: on a scale of 0 ("no change") to 10 ("redesign the whole thing"), where do you stand regarding the perspective of operating changes to dcollections2?

Andrei
May 19 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:

 On 05/19/2010 11:09 AM, Steven Schveighoffer wrote:
 After much work and toil, I have created dcollections for D2.

Awesome! I'm looking through it, unfortunately after wandering in the trunk for a while (I was like, wait, what?). But that was after all good because I saw a lot of awesome improvements in the new library. This is solid work.

Yeah, I will move the D1 version to a branch and copy the D2 version to trunk when I get a chance. Part of the reason is that the automatic doc generator will puke if the trunk is D2.
 That being said, I think dcollections2 has not yet solved a number of 
 problems, some minor, some annoying, and some fundamental. This makes 
 things quite unpleasant because, willingly or not, I'm in a three-way 
 conflict of interest: (1) I can influence to some extent the decision of 
 adopting dcollections2 for phobos; (2) I have my own design in mind 
 which is competing with dcollections2; (3) my time is scarce so I'm 
 having difficulty executing on that design, whereas dcollections2 is 
 already usable.
 
 I'm not sure where this leaves us. I'm tempted to post a list of 
 "grievances" with the current design, but it's difficult to make that 
 sound sincere due to the conflict of interest. Let me start by asking 
 this: on a scale of 0 ("no change") to 10 ("redesign the whole thing"), 
 where do you stand regarding the perspective of operating changes to 
 dcollections2?

It depends on what you want to change. If you say cursors have to go, then I would say no. If you think some functions/classes need to be renamed, or the module structure needs to be changed, I'd say fine. There are other design decisions that can be changed, but I look at it from the perspective of usefulness to me :) If it stops being useful to me, or does not provide a feature I want, then I'll not use it, and then I won't want to maintain it. I guess start with the big ones, 'cause minor ones aren't likely to be a problem. -Steve
May 19 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 04:37 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:
 I'm not sure where this leaves us. I'm tempted to post a list of
 "grievances" with the current design, but it's difficult to make
 Let me start by
 asking this: on a scale of 0 ("no change") to 10 ("redesign the
 whole thing"), where do you stand regarding the perspective of
 operating changes to dcollections2?

It depends on what you want to change. If you say cursors have to go, then I would say no. If you think some functions/classes need to be renamed, or the module structure needs to be changed, I'd say fine. There are other design decisions that can be changed, but I look at it from the perspective of usefulness to me :) If it stops being useful to me, or does not provide a feature I want, then I'll not use it, and then I won't want to maintain it. I guess start with the big ones, 'cause minor ones aren't likely to be a problem.

Well, the thing is, I'm strongly questioning the whole point of defining a hierarchy for containers, in particular when the interfaces are numerous. I'd go as heretical as saying "Container hierarchies are for languages that suck." Also, the fact that (for efficiency reasons) ranges are concrete and inaccessible from the abstract interfaces aggravates the matter even more, to the point where containers are neither horses nor mules: if you use the interfaces you have only severely emasculated access to the container's elements, and if you use the concrete classes, why the heck bother with hierarchies in the first place?

Whew. Let me explain; there's a lot to explain.

Right now there are 10 interfaces in 7 files in the /model/ subdir. It happens quite often that a given class inherits more than one interface; for example ArrayList inherits two, and many set types inherit Set, which in turn inherits Addable!V, Iterator!V, Purgeable!V. The problem with this setup that encodes capabilities in interfaces is that many sensible combinations of capabilities are impossible to obtain. Say you want to define a function messWith(C) where C is Purgeable and Addable. There's no type for that! Set has too many capabilities (which not all containers might support), so you'll need to do a runtime check:

void messWith(Addable!int ia)
{
    auto ip = cast(Purgeable!int) ia;
    enforce(ip, "Wrong type etc.");
    ...
}

It would be more expressive if such capability constraints could be expressed during compilation. To see more about this problem, you may want to check "Compound Types for Java" by Buechi and Weck (just google it). The paper explains the problem in detail and proposes a Java language change to fix it. Java didn't change and was not able to devise a library solution.

I wrote a solution to the problem in native D. It goes like this:

alias Container!(int, addable | purgeable) Messerschmidt;

void messWith(Messerschmidt i)
{
    ... use i's capabilities to add and purge ...
}

The inspiration for this approach was given by Scott Meyers' article "Red code, green code". The approach does work, and rather beautifully, but it's impractical: the class hierarchy forms a dense DAG, and if you add 4-5 capabilities an empty object already has a size of 8KB consisting only of routing vtables.

Escalating this discussion one step further (and probably a bit outside the area of commonly-accepted lore), let me hypothesize that maybe the entire idea of container hierarchies is a red herring, the hoax of the 1990s. Hierarchies are for types that have a lot of commonality and a little variability. Containers are not that. They have personality, refuse to accept straitjacket interfaces, and each has unique capabilities - as your capability-based design witnesses. I agree that reuse via binary interfaces is good when you write functions that exploit them, but in D it's simple and cheap enough to write a concept-checked generic function to get around that. In my designs I either know what container I'm dealing with, or I am in generic-land. I never was like, "mmm, great, I can change this container type at runtime".

But wait, there's less. Furthermore, container hierarchies encourage design by inheritance, which is more often than not poor. If you want to define a container that notifies an observer whenever you add stuff to it, composition is the best route to follow. You don't want a horse with a violin grafted on its back; keep the horse a horse, play the violin, and it'll dance. I've never, ever been in a place in life when I thought, "I should derive from this container class and hook a method of it." Composition always wins.

To get back to one of my earlier points, the fact that the container interfaces are unable to express iteration is a corollary of the design's problem and a climactic disharmony.

My vision, in very brief, is to foster a federation of independent containers abiding by identical names for similar functionality. Then a few concept checks (a la std.range checks) can easily express what capabilities a given client function needs from a container.

Destroy me :o).

Andrei
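The std.range-style alternative can be made concrete with a small sketch. The names here (isAddable, isPurgeable, the Bag type, and remove as the purge primitive) are illustrative assumptions, not dcollections or Phobos API:

```d
// Compile-time capability checks in the style of std.range's isInputRange:
// a type qualifies by having the right methods, not by inheriting anything.
template isAddable(C, V)
{
    enum bool isAddable = is(typeof({ C c; V v; c.add(v); }));
}

template isPurgeable(C, V)
{
    enum bool isPurgeable = is(typeof({ C c; V v; c.remove(v); }));
}

// The compile-time analogue of messWith: a template constraint that
// demands both capabilities at once, with no Addable+Purgeable type needed.
void messWith(C)(ref C c)
    if (isAddable!(C, int) && isPurgeable!(C, int))
{
    c.add(42);
    c.remove(42);
}

// Any type with the right methods qualifies -- even a struct.
struct Bag
{
    int[] data;
    void add(int v) { data ~= v; }
    void remove(int v)
    {
        foreach (i, e; data)
            if (e == v) { data = data[0 .. i] ~ data[i + 1 .. $]; return; }
    }
}

void main()
{
    Bag b;
    messWith(b);
    assert(b.data.length == 0);
    static assert(isAddable!(Bag, int) && !isAddable!(int, int));
}
```

Passing a type without both capabilities fails at compile time in the constraint, rather than at run time in an enforce().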
May 19 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas. People can use dcollections in the following weeks and months, and when you have implemented your ideas the people that like them can switch to using your collections. Bye, bearophile
May 19 2010
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
bearophile Wrote:

 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas.

I don't think the ideas are mutually exclusive. I don't see why having an interface prevents someone from using a concept on the concrete class.
 People can use the dcollections in the following weeks and months, and when
you have implemented your ideas the people that like them can switch to using
your collections.

If it's at all possible, I'd like to cooperate on making dcollections phobos-qualified. So I'm glad to try and find a way to satisfy Andrei's concerns. -Steve
May 19 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 If it's at all possible, I'd like to cooperate on making dcollections
phobos-qualified.  So I'm glad to try and find a way to satisfy Andrei's
concerns.

Very good. Sometimes I am a too much pessimistic person :-) Bye, bearophile
May 19 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 08:53 PM, Steven Schveighoffer wrote:
 bearophile Wrote:

 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas.

I don't think the ideas are mutually exclusive. I don't see why having an interface prevents someone from using a concept on the concrete class.
 People can use the dcollections in the following weeks and months,
 and when you have implemented your ideas the people that like them
 can switch to using your collections.

If it's at all possible, I'd like to cooperate on making dcollections phobos-qualified. So I'm glad to try and find a way to satisfy Andrei's concerns.

I'm very grateful to hear that; as I said, dcollections embodies a great deal of solid work, for which I have all admiration. If we disagree, I hope we won't make this a willpower struggle :o). Let the best ideas win. That being said, how something is said matters even if what is said is true; I'll be extra careful with that. Andrei
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:21 PM, bearophile wrote:
 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas. People can use dcollections in the following weeks and months, and when you have implemented your ideas the people that like them can switch to using your collections. Bye, bearophile

I am sorry, but I've been quite sick since yesterday afternoon. I still can't think straight because of fever, but once it subsides, I'll reply to some of the points made in this thread. Andrei
May 20 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 I am sorry, but I've been quite sick since yesterday afternoon.

Get better, and in the meantime don't worry about D. Bye, bearophile
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:21 PM, bearophile wrote:
 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are certainly interesting, but I don't think there's a simple way to change his code according to them. People can use dcollections in the coming weeks and months, and once you have implemented your ideas, the people who like them can switch to your collections. Bye, bearophile

I actually believe there is a simple transition path. In essence the interfaces in model/ should be rewritten as isXxx tests and the containers should have all of their methods final. Andrei
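A rough sketch of the isXxx approach described here, modeled on std.range's isInputRange; all names below (isClearable, IntBag, reset) are hypothetical illustrations, not dcollections or Phobos API:

```d
// Sketch of an "isXxx" concept check replacing an interface: a
// compile-time test of the members a container must provide.
enum bool isClearable(C) = is(typeof({
    C c = C.init;
    auto n = c.length;  // must expose a length
    c.clear();          // must be clearable
}));

// A client constrains on the concept instead of accepting an interface:
void reset(C)(ref C c) if (isClearable!C)
{
    c.clear();
}

// Any type with the right members qualifies, struct or class:
struct IntBag
{
    int[] data;
    size_t length() { return data.length; }
    void clear() { data = null; }
}

static assert(isClearable!IntBag);
static assert(!isClearable!int);

void main()
{
    auto bag = IntBag([1, 2, 3]);
    reset(bag);
    assert(bag.length == 0);
}
```

Because the test is structural, the containers' methods can all be final: no virtual dispatch is needed for a type to satisfy the concept.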
May 21 2010
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container 
 interfaces are unable to express iteration is a corollary of the 
 design's problem and an climactic disharmony.
 
 My vision, in very brief, is to foster a federation of independent 
 containers abiding to identical names for similar functionality. Then a 
 few concept checks (a la std.range checks) can easily express what 
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy; only the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes, not structs. As long as you agree that classes are the right call, interfaces can co-exist with your other suggestions without interference.

Yes, if you want to say "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyway), or you could just say "I need a List" and get those functions too. It also does not force entities other than dcollections objects to be classes; they could be structs that implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them: binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is: go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them. Does that make sense? -Steve
May 19 2010
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer 
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea.

First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See "Tangled up in tools": http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

Second, designing a library to be flexible enough to meet some future, anticipated need (e.g. DLLs) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong.

Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to; I could have used generic code, but this way only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

Using interfaces is not as viral as you think. My interfaces can be used in generic code, as long as the generic code uses functions in the interfaces. If a library returns an interface, the author is saying "I don't want you using any functions outside this interface," so why is that a bad thing? Forcing people to *not* use interfaces has its drawbacks too. Dcollections gives the most flexible design I could muster while still being useful.

I'm not saying I'm against removing the interfaces at some later date, but I don't see any convincing arguments yet, especially since I've already seen benefits from having them. -Steve
May 20 2010
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:

 I understand these points, but I'm already using interfaces to copy 
 data between containers.  I don't have to, I could have used generic 
 code, but this way, only one function is instantiated to copy data from 
 all the other containers.  The problem with using generic code is that 
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot.

I do like interfaces in principle, but I fear it'll make things much slower when people implement things in terms of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:

 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See:
http://llvm.org/bugs/show_bug.cgi?id=6054
http://llvm.org/bugs/show_bug.cgi?id=3100

---------------------
Steven Schveighoffer:
The problem with using generic code is that the compiler will needlessly
duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, which merges identical functions (like the ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it work. See:
http://llvm.org/bugs/show_bug.cgi?id=6712
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108275

Bye, bearophile
May 20 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 08:30:59 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 
 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See: http://llvm.org/bugs/show_bug.cgi?id=6054 http://llvm.org/bugs/show_bug.cgi?id=3100

Devirtualization is only possible in certain cases: when the function knows exactly which type it'll get. But downcasting to a more generic type and passing it around function calls strips it of the precise information required for devirtualization. The only way to propagate the exact type is to either instantiate a new version of the function you call for that specific type (which is what a template does) or inline it (because inlining also creates a new instance of the function, inside the caller). For instance:

    void test1(List list) {
        list.clear(); // can't devirtualize since we do not know which kind of list we'll get
    }

    void test2() {
        List list = new ArrayList;
        list.clear(); // now the compiler can see we'll always have an ArrayList, can devirtualize
    }

    void test3(L)(L list) {
        list.clear(); // parent's type is propagated, can devirtualize if parent can
    }
 Steven Schveighoffer:
 
 The problem with using generic code is that the compiler will 
 needlessly duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, to merge identical functions (like ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now, so I'd expect most linkers to do it already...

The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:

Devirtualization is only possible in certain cases: when the function knows
exactly which type it'll get.<

You are wrong; in most cases there are ways to devirtualize even when the runtime type isn't exactly known, but sometimes doing it takes too much work. This is probably why the C# .NET runtime doesn't perform this optimization. It's a complex topic, I suggest you read about it; I can't explain it here. See polymorphic call points, megamorphic ones, dispatch trees, and so on. You can read something here too:
http://www.digitalmars.com/webnews/newsgroups/newsgroups.php?art_group=digitalmars.D&article_id=105630

LLVM is not able to do this well yet, but once the -O7 optimization level is available it will have the means (if not the capability) to devirtualize as well as HotSpot:
http://linux.die.net/man/1/llvmc

And even if you don't compile your code with -O7, there are ways to improve the situation with static code analysis (though usually this is not as good as type information collected at runtime or during profiling for profile-based optimization).
Indeed. I'm no expert in linkers, but in my opinion this is one of the most
basic optimizations a linker should perform.<

At its basic level it's an easy optimization, but as usual, if you want to implement something well, things get harder :-) In LDC it's the compiler that performs this optimization, and only if you manually add the -mergefunc switch, which is not enabled even at -O3.
The problem with templates are more the multiple slightly different
instanciations.<

This is what I was talking about with things getting "harder". A good implementation of this feature can recognize slightly different functions too, and remove such partial redundancy. This is doable, but LLVM is not able to do it yet. The good thing is that LLVM is still improving quickly.
In general this is good for performance, but it's only needed for the code
paths that need to be fast. I think generic containers should be fast.<

I am not sure what you mean here, but you may be interested in this thread I started: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108136 Bye, bearophile
May 20 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 15:47:22 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 
 Devirtualization is only possible in certain cases: when the function 
 knows exactly which type it'll get.<

You are wrong, in most cases there are ways to de-virtualize, even when the runtime type isn't exactly known, but sometimes to do it you have to work too much. This is probably why C# dotnet doesn't perform this optimization. It's a complex topic, I suggest you read about it, I can't explain here; see polymorphic call points, megamorphic ones, dispatch trees, and so on.

I know I simplified a bit, and I'll admit you may know more than me about dynamic dispatch optimizations. But if I'm not mistaken, the other devirtualizing solutions create multiple instances of the code path based on either static info or a runtime switch. All this isn't very different from calling a templated function, but it's more complex and less reliable (because those optimizations might not be in the compiler, or might not be applicable to all scenarios).

My point all along was that it's better to stick with templates, where you're "guaranteed" to get the "optimal" code path. The downside is that you lose the dynamic dispatch capabilities. But I believe those are rarely needed in most programs. And if you really need it, it's quite easy and efficient to layer dynamic dispatch over static calls (much more so than the reverse). -- Michel Fortin michel.fortin michelf.com http://michelf.com/
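Layering dynamic dispatch over static calls, as suggested here, can be sketched like this (all names are hypothetical illustrations, not dcollections API):

```d
// An interface is recovered on demand by wrapping any type that
// passes the static checks in a small class.
interface Countable
{
    size_t count();
}

// One generic wrapper class turns any type with a length into a
// Countable, so nothing forces the containers themselves to be classes.
class CountableWrapper(C) : Countable
{
    private C c;
    this(C c) { this.c = c; }
    size_t count() { return c.length; }
}

void main()
{
    int[] data = [1, 2, 3];
    Countable dyn = new CountableWrapper!(int[])(data);  // dynamic view
    assert(dyn.count() == 3);   // virtual call through the interface
    assert(data.length == 3);   // static call, inlinable
}
```

The reverse direction (recovering static dispatch from an interface) is what's hard, which is the asymmetry the post points at.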
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 02:47 PM, bearophile wrote:
 Michel Fortin:

 Devirtualization is only possible in certain cases: when the
 function knows exactly which type it'll get.<

You are wrong, in most cases there are ways to de-virtualize, even when the runtime type isn't exactly known, but sometimes to do it you have to work too much. This is probably why C# dotnet doesn't perform this optimization.

If we get deeper into this branch, we'll forget where we started from (are interfaces sensible for this design?) and we'll reframe the interfaces vs. no interfaces decision into a speed loss vs. no speed loss decision. There are other arguments to look at besides performance. Andrei
May 21 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 09:41 AM, Michel Fortin wrote:
 On 2010-05-20 08:30:59 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:

 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See: http://llvm.org/bugs/show_bug.cgi?id=6054 http://llvm.org/bugs/show_bug.cgi?id=3100

Devirtualization is only possible in certain cases: when the function knows exactly which type it'll get. But downcasting to a more generic type and passing it around function calls strips it of this precise information required for devirtualization.

(I assume you meant upcasting.) Even then, it's possible to accelerate calls by doing method casing, type casing, class hierarchy analysis...
 The only way to propagate the
 exact type is to either instanciate a new version of the function you
 call for that specific type (which is what a template does) or inline it
 (because it also creates a new instance of the function, inline inside
 the caller).

 For instance:

 void test1(List list) {
 list.clear(); // can't devirtualize since we do not know which kind of
 list we'll get
 }

 void test2() {
 List list = new ArrayList;
 list.clear(); // now the compiler can see we'll always have an
 ArrayList, can devirtualize
 }

 void test3(L)(L list) {
 list.clear(); // parent's type is propagated, can devirtualize if parent
 can.
 }


 Steven Schveighoffer:

 The problem with using generic code is that the compiler will
 needlessly duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, to merge identical functions (like ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now so I'd expect most linkers to do that already... The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast.

There's been talk that Walter's slow porting of the linker to C and then D will put him in the position to introduce such an optimization. Andrei
May 21 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 See the -mergefunc compiler switch of LDC, to merge identical
 functions (like ones produced by template instantiations). This
 feature is not very powerful yet, but it's present and I have seen it
 works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now so I'd expect most linkers to do that already... The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast.

There's been talk that Walter's slow porting of the linker to C and then D will put him in the position to introduce such an optimization.

Unfortunately, it's an agonizingly slow process. It also does nothing for Linux or OSX.
May 21 2010
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Michel Fortin Wrote:

 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy 
 data between containers.  I don't have to, I could have used generic 
 code, but this way, only one function is instantiated to copy data from 
 all the other containers.  The problem with using generic code is that 
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. The speedup you get from inlining is much larger than the difference between virtual and non-virtual calls. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

One thing I just thought of -- in dcollections, similar types can be compared to one another. For example, you can check to see if a HashSet is equal to a TreeSet. But that would not be possible without interfaces. -Steve
May 20 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be compared
to one another.  For example, you can check to see if a HashSet is equal to a
TreeSet.  But that would not be possible without interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever. You could compare the content separately, but that wouldn't require interfaces.
May 20 2010
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Pelle Wrote:

 On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be compared
to one another.  For example, you can check to see if a HashSet is equal to a
TreeSet.  But that would not be possible without interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever.

They aren't trees and hashtables; they are sets. The "quick lookup" feature is implemented via a hashtable or tree. Basically, a set is defined by the exact elements in it, regardless of order. Comparing two sets for equality should always be possible. -Steve
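The set semantics described here can be sketched as follows (toy code with plain arrays standing in for the containers; dcollections' real HashSet/TreeSet are classes sharing a Set interface):

```d
import std.algorithm : canFind;

// Two sets are equal iff they hold exactly the same elements,
// regardless of the iteration order their lookup structure produces.
bool setEqual(A, B)(A a, B b)
{
    if (a.length != b.length)
        return false;
    foreach (e; a)
        if (!b.canFind(e))  // a real set would use an O(1)/O(lg n) contains()
            return false;
    return true;
}

void main()
{
    int[] hashOrder = [3, 1, 2];  // e.g. hash-table iteration order
    int[] treeOrder = [1, 2, 3];  // sorted tree iteration order
    assert(setEqual(hashOrder, treeOrder));
    assert(!setEqual(hashOrder, [1, 2, 4]));
}
```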
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 09:14 AM, Pelle wrote:
 On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be
 compared to one another. For example, you can check to see if a
 HashSet is equal to a TreeSet. But that would not be possible without
 interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever.

Yes. By the way, TDPL's dogma imposes that a == b for classes means they have the same dynamic type. Andrei
May 21 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 09:22:27 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:

 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data from
 all the other containers.  The problem with using generic code is that
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual.

Yes, but that was part of the equation: a call to a template function can be inlined, not a virtual call (most of the time).
 However, I should probably make all the functions in the concrete 
 implementations final.  I made several of them final, but I should do 
 it across the board.

Yeah, probably.
 One thing I just thought of -- in dcollections, similar types can be 
 compared to one another.  For example, you can check to see if a 
 HashSet is equal to a TreeSet.  But that would not be possible without 
 interfaces.

I'm not sure what you're saying here. Are you saying it can be done with an interface but not with a template type? Why can't this work:

    class TreeSet {
        bool opEquals(T)(T t) if (IsSet!T) {
            if (t.length != this.length)
                return false;
            foreach (element; this) {
                if (!t.has(element))
                    return false;
            }
            return true;
        }
    }

    TreeSet a = new TreeSet;
    HashSet b = new HashSet;
    assert(a == b);

(sorry, I still have to start using the new operator overloading syntax.) -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Michel Fortin Wrote:

 On 2010-05-20 09:22:27 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data from
 all the other containers.  The problem with using generic code is that
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual.

Yes, but that was part of the equation: a call to a template function can be inlined, not a virtual call (most of the time).

So if you want inlining, use the actual type; nothing is stopping you. Well, except the non-final functions -- I have to fix that.
 One thing I just thought of -- in dcollections, similar types can be 
 compared to one another.  For example, you can check to see if a 
 HashSet is equal to a TreeSet.  But that would not be possible without 
 interfaces.

I'm not sure of what you're saying here. Are you saying it can be done with an interface but not with a template type? Why can't this work:

Because comparing two objects for equality now calls this function:

    bool opEquals(Object obj1, Object obj2)

which is defined in object_.d in the runtime. No compile-time anything is allowed. -Steve
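A minimal sketch of the constraint described here: for classes, == dispatches through the Object-level opEquals, so only a method matching Object's signature participates in the comparison (Box is a hypothetical example type):

```d
// For classes, a == b is rewritten to a call through Object.opEquals,
// resolved at runtime at the Object level; a templated opEquals never
// participates in ==.
class Box
{
    int v;
    this(int v) { this.v = v; }

    override bool opEquals(Object o)  // must match Object's signature
    {
        auto b = cast(Box) o;
        return b !is null && b.v == v;
    }
}

void main()
{
    Object a = new Box(1);
    Object b = new Box(1);
    assert(a == b);  // dispatches through the Object-level opEquals
}
```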
May 20 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

You get a much higher speedup when things can be inlined than virtual vs.
non-virtual.<

In current D compilers virtual functions prevent inlining. This can give some performance loss. Bye, bearophile
May 20 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
bearophile Wrote:

 Steven Schveighoffer:
 
You get a much higher speedup when things can be inlined than virtual vs.
non-virtual.<

In current D compilers virtual functions prevent inlining. This can give some performance loss.

People seem to think that because dcollections implements interfaces, you must never store a concrete collection instance. This isn't true: you can use dcollections as concrete classes without dealing with interfaces at all, and all the functions should be inlinable (at least, they will be when I change them all to final functions).

dcollections provides all of the genericity that you need -- ranges, template member functions, etc. AND the containers also implement interfaces, should that be part of your design. I've pointed out that a good use case for passing interfaces instead of concrete classes is dynamic libraries, where you don't want private member changes to affect runtime code. -Steve
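The dual use described here (concrete type for inlining, interface for dynamic dispatch) can be sketched like this, with hypothetical stand-ins for dcollections' List/ArrayList:

```d
// A concrete container implements an interface for callers who want
// dynamic dispatch, while its methods are final so that calls through
// the concrete type can be inlined.
interface List
{
    size_t length();
}

class ArrayList : List
{
    private int[] data;
    final void add(int x) { data ~= x; }           // final: direct, inlinable
    final size_t length() { return data.length; }  // final, yet satisfies List
}

void main()
{
    auto a = new ArrayList;  // concrete type: final, inlinable calls
    a.add(42);
    List l = a;              // interface view: virtual dispatch when wanted
    assert(l.length == 1);
    assert(a.length == 1);
}
```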
May 20 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 08:22 AM, Steven Schveighoffer wrote:
 Michel Fortin Wrote:

 On 2010-05-20 06:34:42 -0400, Steven
 Schveighoffer<schveiguy yahoo.com>  said:

 I understand these points, but I'm already using interfaces to
 copy data between containers.  I don't have to, I could have used
 generic code, but this way, only one function is instantiated to
 copy data from all the other containers.  The problem with using
 generic code is that the compiler will needlessly duplicate
 functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

Yup. Even Java does that. I forgot whether it was a recanting of a former stance (as was the case with synchronized) or things were like that from the get-go.
 One thing I just thought of -- in dcollections, similar types can be
 compared to one another.  For example, you can check to see if a
 HashSet is equal to a TreeSet.  But that would not be possible
 without interfaces.

Of course it would be possible. You write a generic function that takes two generic containers and constrains the inputs such that at least one has element lookup capability. (Complexity-oriented design for the win!) Andrei
May 21 2010
next sibling parent BCS <none anon.com> writes:
Hello Andrei,

 On 05/20/2010 08:22 AM, Steven Schveighoffer wrote:
 
 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven
 Schveighoffer<schveiguy yahoo.com>  said:
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used
 generic code, but this way, only one function is instantiated to
 copy data from all the other containers.  The problem with using
 generic code is that the compiler will needlessly duplicate
 functions that are identical.
 

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in terms of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

Yup. Even Java does that. I forgot whether it was a recanting of a former stance (as was the case with synchronized) or things were like that from the get-go.

 One thing I just thought of -- in dcollections, similar types can be
 compared to one another.  For example, you can check to see if a
 HashSet is equal to a TreeSet.  But that would not be possible
 without interfaces.

Of course it would be possible. You write a generic function that takes two generic containers and constrains the inputs such that at least one has element lookup capability. (Complexity-oriented design for the win!) Andrei

Cool. Now how do I write code so that it will always iterate the collection with the bigger O() lookup time (O(n) before O(log2(n)) before O(log16(n)) before O(1))? :D -- ... <IXOYE><
May 21 2010
prev sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
BCS <none anon.com> wrote:

 Cool. Now how do I write code so that it will always iterate the  
 collection with the bigger O() lookup time (O(n) before O(log2(n))  
 before O(log16(n)) before O(1))? :D

Add a function.

    auto foo( R1, R2 )( R1 r1, R2 r2 )
        if ( R1.complexity( 10_000 ) > R2.complexity( 10_000 ) )
    { ... }

    auto foo( R1, R2 )( R2 r1, R1 r2 )
        if ( R1.complexity( 10_000 ) < R2.complexity( 10_000 ) )
    { ... }

-- Simen
May 22 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 06:17 AM, Michel Fortin wrote:
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com>
 said:

 I understand these points, but I'm already using interfaces to copy
 data between containers. I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data
 from all the other containers. The problem with using generic code is
 that the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

There will be differences, but let's keep in mind that that's one of several arguments against container interfaces. Andrei
May 21 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 05:34 AM, Steven Schveighoffer wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See "Tangled up in tools": http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, designing a library to be flexible enough to meet some future, anticipated need (e.g. DLLs) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

There is a copy() function that copies any range to any other range. It might need a revisit, but I think the way to go about copying is generic. Let's not forget that the code for copying itself is rather short and that applications don't tend to use a large number of container types.
 Using interfaces is not as viral as you think.  My interfaces can be
 used in generic code, as long as the generic code uses functions in
 the interfaces.  If a library returns an interface, the author is
 saying "I don't want you using any functions outside this interface,"
 so why is that a bad thing?

 Forcing people to *not* use interfaces has its drawbacks too.
 Dcollections gives the most flexible design I could muster, while
 still being useful.

 I'm not saying I'm against removing the interfaces until some later
 date, but I don't see any convincing arguments yet, especially since
 I've already seen benefits from having them.

What are the benefits that you have noticed? I think you'd need to back up the copying argument with some data if you want to frame it as a benefit. Andrei
May 21 2010
prev sibling next sibling parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.
May 20 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
superdan Wrote:

 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.

I think classes are the right move.

First, a collection makes more sense as a reference type. Note that both arrays and associative arrays are reference types. If collections are value types, like in C++, then copying a node-based collection means duplicating all the nodes. Copy construction is still possible through a function -- dup. Having value semantics makes it too easy to copy large amounts of heap data, hurting performance. Many inexperienced C++ coders pass a std::set by value, not realizing why their code is so ridiculously slow. I think one of the things that makes D so damn fast is that large data structures such as arrays and AA's are always passed by reference.

Second, since reference types are the right thing to do, classes are much easier to deal with. I know AA's are reference types that are structs, but the code needed to perform this feat is not trivial. The AA has only one member, a reference to the data struct, which is allocated on the heap. Any member function/property that is used on the AA must first check whether the implementation is allocated yet. The only benefit this gives you IMO is not having to use 'new' on it. And even that has some drawbacks. For example, pass an empty AA by value to a function, and if that function adds any data to it, it is lost. But pass an AA by value with one element in it, and the new data sticks.

A class gives you much more in terms of options -- interfaces, builtin synchronization, runtime comparison, etc. And it forces full reference semantics by default. I think regardless of whether interfaces are defined for dcollections, classes give a better set of options than structs.

Also note that I intend to make all dcollections classes final, so there will be no virtual calls as long as you have a reference to the concrete type.

Is there some other reason to use structs besides copy construction?

-Steve
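The AA pass-by-value quirk described above can be reproduced in a few lines (a sketch; `addOne` is a hypothetical name):

```d
void addOne(int[int] aa)
{
    aa[1] = 1;  // mutates the callee's copy of the AA handle
}

void main()
{
    int[int] empty;             // null handle: no implementation allocated yet
    addOne(empty);
    assert(empty.length == 0);  // callee allocated its own impl; data is lost

    int[int] seeded = [0: 0];   // impl already allocated before the call
    addOne(seeded);
    assert(1 in seeded);        // caller and callee share the impl; data sticks
}
```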
May 21 2010
next sibling parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 superdan Wrote:
 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.


If collections are value types, like in C++, then copying a collection that is a node-based collection means duplicating all the nodes. Copy construction is essentially possible through a function -- dup. Having value-semantics makes it too easy to copy large amounts of heap data hurting performance. Many inexperienced C++ coders pass a std::set by value, not realizing why their code is so ridiculously slow. I think one of the things that makes D so damn fast is because large data structures such as arrays and AA's are always passed by reference.
 Second, since reference types are the right thing to do, classes are much easier to deal with. I know AA's are reference types that are structs, but the code needed to perform this feat is not trivial. The AA has only one member, a reference to the data struct, which is allocated on the heap. Any member function/property that is used on the AA must first check whether the implementation is allocated yet. The only benefit this gives you IMO is not having to use 'new' on it. And even that has some drawbacks. For example, pass an empty AA by value to a function, and if that function adds any data to it, it is lost. But pass an AA by value with one element in it, and the new data sticks. A class gives you much more in terms of options -- interfaces, builtin synchronization, runtime comparison, etc. And it forces full reference semantics by default. I think regardless of whether interfaces are defined for dcollections, classes give a better set of options than structs.
 Also note that I intend to make all dcollections classes final, so there will be no virtual calls as long as you have a reference to the concrete type.
 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.

den there's all that null ref shit. with a class u have

void foo(container!shit poo)
{
    poo.addElement(Shit(diarrhea));
}

dat works with struct but don't work with motherfucking classes. u need to write.

void foo(container!shit poo)
{
    if (!poo) poo = new container!shit; // fuck dat shit
    poo.addElement(Shit(diarrhea));
}

u feel me?
May 21 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
superdan Wrote:

 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article

 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.

This is not necessary with purely memory-based constructs -- the GC is your friend. The custom allocator ability in dcollections should provide plenty of freedom for memory allocation schemes.
 
 den there's all that null ref shit. with a class u have
 
 void foo(container!shit poo)
 {
     poo.addElement(Shit(diarrhea));
 }
 
 dat works with struct but don't work with motherfucking classes. u need to
write.
 
 void foo(container!shit poo)
 {
     if (!poo) poo = new container!shit; // fuck dat shit
     poo.addElement(Shit(diarrhea));
 }
 
 u feel me?

It doesn't work. Just with your example, you won't get what you expect. Try it with AA's. Even your class example doesn't make any sense. -Steve
May 21 2010
parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 superdan Wrote:
 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.


 This is not necessary with purely memory-based constructs -- the GC is your friend. The custom allocator ability in dcollections should provide plenty of freedom for memory allocation schemes.

how do u set up yer custom allocator to free memory? u cant tell when its ok. copying refs iz under da radar. dats my point.
 den there's all that null ref shit. with a class u have

 void foo(container!shit poo)
 {
     poo.addElement(Shit(diarrhea));
 }

 dat works with struct but don't work with motherfucking classes. u need to
write.

 void foo(container!shit poo)
 {
     if (!poo) poo = new container!shit; // fuck dat shit
     poo.addElement(Shit(diarrhea));
 }

 u feel me?


wut? it don't work? whaddaya mean it dun work? is you crazy? what dun work? maybe therez sum misundercommunication. repeating. if container is struct this shit works:

void foo(container!shit poo)
{
    poo.addElement(Shit(diarrhea));
}

dun tell me it dun work. i dun explain shit again. it works coz a struct cant be null. but a struct can be a ref if it only haz one pointer inside. methinks the builtin hash iz dat way. if container iz class dat shit dun work. u need to write dis shit:

void foo(container!shit poo)
{
    if(!poo) poo = new container!shit; // fuck dat shit
    poo.addElement(Shit(diarrhea));
}

dat sucks bull ballz.
May 21 2010
parent BCS <none anon.com> writes:
Hello superdan,


 dun tell me it dun work. i dun explain shit again. it works coz a
 struct cant be null. but a struct can be a ref if it only haz one
 pointer inside. methinks the builtin hash iz dat way.
 

 void foo(container!shit poo)
 {
 if(!poo) poo = new container!shit; // fuck dat shit
 poo.addElement(Shit(diarrhea));
 }
 dat sucks bull ballz.

That code is broken: if poo is of a class type, it just new's a container, adds an element, and then drops the reference to be GC'ed. To make it do anything generally useful, it would need another level of indirection. -- ... <IXOYE><
May 21 2010
prev sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/21/2010 09:14 AM, Steven Schveighoffer wrote:
 Second, since reference types are the right thing to do, classes are much
easier to deal with.  I know AA's are reference types that are structs, but the
code needed to perform this feat is not trivial.  The AA has only one member, a
reference to the data struct, which is allocated on the heap.  Any member
function/property that is used on the AA must first check whether the
implementation is allocated yet.  The only benefit this gives you IMO is not
having to use 'new' on it.  And even that has some drawbacks.  For example,
pass an empty AA by value to a function, and if that function adds any data to
it, it is lost.  But pass an AA by value with one element in it, and the new
data sticks.  A class gives you much more in terms of options -- interfaces,
builtin synchronization, runtime comparison, etc.  And it forces full reference
semantics by default.  I think regardless of whether interfaces are defined for
dcollections, classes give a better set of options than structs.

Wow. A partially-nullable type. Great. Now I have to review everywhere I ever used an AA. Thanks, D.

Is there any serious drawback to something like (int[int]).init = InitializedAA!(int,int) ?
May 21 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/21/2010 11:55 AM, Ellery Newcomer wrote:
 On 05/21/2010 09:14 AM, Steven Schveighoffer wrote:
 Second, since reference types are the right thing to do, classes are
 much easier to deal with. I know AA's are reference types that are
 structs, but the code needed to perform this feat is not trivial. The
 AA has only one member, a reference to the data struct, which is
 allocated on the heap. Any member function/property that is used on
 the AA must first check whether the implementation is allocated yet.
 The only benefit this gives you IMO is not having to use 'new' on it.
 And even that has some drawbacks. For example, pass an empty AA by
 value to a function, and if that function adds any data to it, it is
 lost. But pass an AA by value with one element in it, and the new data
 sticks. A class gives you much more in terms of options -- interfaces,
 builtin synchronization, runtime comparison, etc. And it forces full
 reference semantics by default. I think regardless of whether
 interfaces are defined for dcollections, classes give a better set of
 options than structs.

Wow. A partially-nullable type. Great. Now I have to review everywhere I ever used an AA. Thanks, D. is there any serious drawback to something like (int[int]).init = InitializedAA!(int,int) ?

Or should one just always give an AA param either a const or ref modifier?
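For what it's worth, a ref parameter does sidestep the lost-update case, since the caller's handle itself gets updated when the first insertion allocates the implementation (a minimal sketch; `addOne` is a hypothetical name):

```d
void addOne(ref int[int] aa)
{
    aa[1] = 1;  // through ref, the allocation rebinds the caller's handle
}

void main()
{
    int[int] empty;      // null handle, nothing allocated yet
    addOne(empty);
    assert(1 in empty);  // the new impl (and the data) is visible to the caller
}
```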
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the
 container interfaces are unable to express iteration is a corollary
 of the design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality.
 Then a few concept checks (a la std.range checks) can easily
 express what capabilities a given client function needs from a
 container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are.

Without final, they are the roots of a hierarchy. But I understand you are making containers final, which is great.
 I.e. there aren't many
 kinds of HashMaps that derive from each other.  But the interfaces
 are not detrimental to your ideas.  The only thing interfaces require
 is that the entities implementing them are classes and not structs.
 As long as you agree that classes are the right call, then interfaces
 can co-exist with your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.
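A "perfectly encapsulated" container in this sense might be sketched as follows (hypothetical names; elements pass in and out by value only, so the struct is free to manage its own storage):

```d
// Sketch: a stack that never escapes the addresses of its elements,
// making it a candidate for manual storage management.
struct SealedStack(T)
{
    private T[] store;  // element addresses never leave this struct

    void push(T value) { store ~= value; }

    // returns by value -- no pointer or ref into store is ever exposed
    T pop()
    {
        auto v = store[$ - 1];
        store = store[0 .. $ - 1];
        return v;
    }

    bool empty() const { return store.length == 0; }
}
```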
 Yes, if you want to define "this function needs something that is
 both addable and purgeable, I don't have an interface for that.  But
 a concept can certainly define that generically (which is what you
 want anyways), or you could just say "I need a List" and get those
 functions also.  It also does not force entities other than
 dcollections objects to be classes, they could be structs and
 implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it is
 mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with some half-understood seeming of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.
 But I can see
 one good reason to keep them -- binary interoperability.  For
 example, it might be the case some day when D has good support with
 dynamic libraries that a library exposes some piece of itself as a
 Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.
 So my answer is -- go ahead and define these concepts and required
 names, and you can ignore the interfaces if they don't interest you.
 They do not subtract from the possibilities, and others may find good
 use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion. Why would we keep in the standard library bad design with the advice that "if you don't like it ignore it"? Andrei
May 21 2010
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:
 
 I don't know Tango, but Java's containers are a terrible example to 
 follow. Java's container library is an ill-advised design on top of an 
 underpowered language, patched later with some half-understood seeming 
 of genericity. I think Java containers are a huge disservice to the 
 programming community because they foster bad design.

Tango's container library is or was a port of Doug Lea's containers for Java. While Doug Lea is an absolute master with concurrency, I never liked his container library very much. As you've said, the common abstractions it draws are weird and not terribly useful.
 I need to disagree with that. I've done and I do a ton of binary 
 interoperability stuff. You never expose a generic container interface! 
 Interoperable objects always embody high-level logic that is specific to 
 the application. They might use containers inside, but they invariably 
 expose high-level, application-specific functionality.

This. Iterators (or ranges) are passed all over the place, but when operating directly on a container I always want to know what kind of container I'm dealing with. Cost of operations is an issue, I may need to manually sort at some point or be aware that iterators will be invalidated, etc. Steve has already said he doesn't use the interfaces, and I'm a huge fan of not doing speculative design. It's invariably wrong and then you get stuck supporting it. I'd vote to drop the interfaces. They can always be added back later anyway.
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:01 AM, Steven Schveighoffer wrote:
 On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many kinds of HashMaps that derive from each
 other. But the interfaces are not detrimental to your ideas. The
 only thing interfaces require is that the entities implementing
 them are classes and not structs. As long as you agree that
 classes are the right call, then interfaces can co-exist with
 your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting, I don't think that is a good design.

Why? I think refcounting for containers is perfect. Refcounting small objects is bad because the counter overhead is large. Also, small objects tend to be created and passed around a lot, so the time overhead is significant. In contrast, refcounting and containers are a perfect marriage. The container is a relatively large conglomerate of objects so refcounting that is bound to yield good benefits in terms of memory reclamation.
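A ref-counted container handle along these lines could be sketched with a D2 postblit (a hypothetical illustration, not dcollections code; opAssign omitted for brevity):

```d
struct RcContainer(T)
{
    private static struct Impl { T[] data; size_t refs; }
    private Impl* impl;

    static RcContainer make()
    {
        RcContainer c;
        c.impl = new Impl(null, 1);  // one conglomerate allocation per container
        return c;
    }

    this(this)  // postblit: copying the handle just bumps the count
    {
        if (impl) ++impl.refs;
    }

    ~this()     // last handle out reclaims the whole payload at once
    {
        if (impl && --impl.refs == 0)
            impl.data = null;  // or hand it back to a custom allocator
    }

    void add(T item) { impl.data ~= item; }
    size_t length() const { return impl ? impl.data.length : 0; }
}
```

The counter overhead is amortized over the whole collection, which is why it pays off here where it would not for small objects.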
 Yes, if you want to define "this function needs something that
 is both addable and purgeable, I don't have an interface for
 that. But a concept can certainly define that generically (which
 is what you want anyways), or you could just say "I need a List"
 and get those functions also. It also does not force entities
 other than dcollections objects to be classes, they could be
 structs and implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it
 is mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with some half-understood seeming of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.

Java has some warts as you have rightfully pointed out in the past (i.e. O(n) lookup), but I have attempted to remove all those warts. Dcollections I would hope does not suffer from the problems that Java's collections suffer from.

That's great. But let me quote what you said: "I myself don't really use the interface aspect of the classes, it is mostly a carryover from the Java/Tango inspirations." I took that to mean you don't have a strong justification for structuring dcollections as a hierarchy, and furthermore that makes me hope it's possible you'd be willing to yank that dinosaur brain.
 But I can see one good reason to keep them -- binary
 interoperability. For example, it might be the case some day when
 D has good support with dynamic libraries that a library exposes
 some piece of itself as a Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.

It's done all the time in Java and .NET. For example, a GUI listbox widget exposes its elements as an array of elements, which implement the List interface. You don't ever see the implementation or need it. Granted Java and .NET have fewer problems than C++ and D with binary compatibility, since the function tables are dynamic, but the potential is there for D to make binary compatibility possible with interfaces.

I see.
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.
 So my answer is -- go ahead and define these concepts and
 required names, and you can ignore the interfaces if they don't
 interest you. They do not subtract from the possibilities, and
 others may find good use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion.

The reason I put them in is because they existed before, but thinking about what they would be useful for, D doesn't really support dynamic libraries all that well (recent advances have brought that closer). But once it does, I think it will be much more important to keep compiled libraries compatible between versions of the standard library. This is typically not possible with templates; I know C++ is horrible at it from personal experience. So I'm not convinced it is not quite good at anything. I think it's a solution to a problem that doesn't exist yet on D because D's linking features are stuck in 1995.

Clearly interfaces have their advantages. I just happen to think they are much thinner than their disadvantages for this particular case.
 Why would we keep in the standard library bad design with the
 advice that "if you don't like it ignore it"?

People have continuously used that argument against const and immutable. Really that argument is fine, as long as you don't consider it bad design, which I don't :)

Looks like we're heading straight to stalemate once again. In the interest of time, it would be better to get to stalemate (or, hopefully, agreement) so we know whether dcollections will be integrated within Phobos or whether I should spend this and next weeks' evenings coding. To that end, please let me know whether it's worth that I spend time on putting together a list of proposed changes. Andrei
May 24 2010
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits? -- Bruno Medeiros - Software Engineer
May 26 2010
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and even phobos or dcollections isn't a dynamic lib.

I've got my patch for building Tango as a dynamic library on Mac quite recently included in trunk. And I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch making it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to still think that dynamic libraries aren't supported; we have to think forward and assume they will be supported.
 And I have specifically decided not to use interfaces with ranges
 because that makes them reference types. Ranges work well as value
 types, but not well as reference types. Therefore, to use dcollections
 as interfaces, you must not require the range traits.

 -Steve

-- /Jacob Carlborg
May 28 2010
parent Jacob Carlborg <doob me.com> writes:
On 2010-05-28 14:12, Steven Schveighoffer wrote:
 On Fri, 28 May 2010 06:10:49 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

I've got my patch for building Tango as a dynamic library on Mac included in trunk quite recently. I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch making it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to still think that dynamic libraries aren't supported; we have to think forward and assume they will be supported.

I remember that, and I'm very encouraged by it. That being said, the ultimate goal is to have dmd be able to build dynamic libraries easily. D has had dynamic library "support" for years, but you have to do all kinds of manual stuff, and the standard library isn't dynamic. Until the standard library is dynamic and I can build a dynamic library with a -shared equivalent switch, dynamic libs are a laboratory feature, and not many projects will use them.

Yes, exactly, I noticed there isn't an easy way to build dynamic libraries; among other things, you have to know the path to the standard library when building manually.
 Just so you know, I think it's important to support binary compatibility
 in dcollections, and since std.container has not adopted dcollections,
 I'm going to keep interfaces. I was just pointing out the position
 others may have WRT binary support.

 -Steve

-- /Jacob Carlborg
May 28 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 27/05/2010 11:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

Ah, never mind, my mind slipped and I was thinking of any kind of library, that is, static ones as well. Although even just dynamic-library compatibility seems to be a valid enough case that we should consider it from the start, even if it's not well supported currently. -- Bruno Medeiros - Software Engineer
May 28 2010
prev sibling parent Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Mon, 24 May 2010 11:01:06 -0400, Steven Schveighoffer 
<schveiguy yahoo.com> wrote:
 It's done all the time in Java and .NET.  For example, A GUI listbox  
 widget exposes its elements as an array of elements, which implement the  
 List interface.  You don't ever see the implementation or need it.   
 Granted Java and .NET have less problems than C++ and D with binary  
 compatibility, since the function tables are dynamic, but the potential is  
 there for D to make binary compatibility possible with interfaces.
 

Cocoa (NeXT's/Apple's framework for Objective-C) uses a very successful and well-thought-out delegation pattern, whereby GUI elements representing large amounts of data, like table views, have delegates (not in the D sense of the word) that provide them with the actual contents. Granted, Objective-C's runtime is much more dynamic than D's, but a simplified version of such a pattern could still work in D. After all, user interfacing is typically where dynamism is more important than speed.
May 24 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 09:59 PM, Robert Jacques wrote:
 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

Yes, if you want to define "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyways), or you could just say "I need a List" and get those functions also. It also does not force entities other than dcollections objects to be classes; they could be structs and implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them -- binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is -- go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them. Does that make sense?

-Steve
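The two styles being debated can be sketched side by side. This is a hypothetical illustration, not the actual dcollections API: the `List` interface here is a minimal stand-in, and `isAddable` is an assumed concept check in the std.range style.

```d
// Minimal stand-in for a container interface (hypothetical, not the
// real dcollections List).
interface List(V)
{
    List!V add(V v);
}

// Concept check: any type (class or struct) with a suitable add()
// method qualifies, in the manner of std.range's isInputRange et al.
template isAddable(C, V)
{
    enum bool isAddable = is(typeof({ C c; V v; c.add(v); }));
}

// Interface-constrained: virtual dispatch, usable across a binary boundary.
void fillViaInterface(List!int l) { l.add(42); }

// Concept-constrained: compile-time check, no indirection required.
void fillViaConcept(C)(C c) if (isAddable!(C, int)) { c.add(42); }

// A plain struct satisfies the concept without touching the interface.
struct IntBag
{
    int[] items;
    void add(int v) { items ~= v; }
}

static assert(isAddable!(IntBag, int));
```

The point of the sketch: the concept route accepts structs and classes alike, while the interface route remains available for code that needs runtime or binary-boundary polymorphism; the two do not conflict.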

Yes and No. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

For the record, I strongly agree with this.
 Second, I
 think designing a library to be flexible enough to meet some future,
 anticipated need (e.g. dlls) is a good idea, but actually implementing
 vaporous future needs is fraught with peril; it's too easy to guess
 wrong. Third, interface-based design is viral; if library X uses
 interfaces then I have to use interfaces to interface with it. And if
 another library Y uses classes, then I'm going to have to write a
 (needless) wrapper around one of them.

That's a good argument as well. I like to put it a different way: you can get the advantages of an interface by wrapping a struct, but you can't get the advantages of a struct by wrapping an interface.

Andrei
May 21 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 05/19/2010 09:59 PM, Robert Jacques wrote:
 Yes and No. I understand where you're coming from, but I think it's a bad
 idea. First, I think it needlessly expands the radius of comprehension
 needed to understand and use the library. (See Tangled up in tools
 http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

For the record, I strongly agree with this.

I do too, but that's the easy part. Living up to those ideals is extremely hard, usually because most designers think they have designed simple interfaces when everyone else thinks they didn't.

In this whole container discussion, I'd like to point out something Andrei pointed out to me. The one extremely successful example of pluggable components is the unix filter model. Essentially, one program pipes its output to the next, which pipes its output to the next, etc. The unix console is designed around that paradigm.

If we can get anywhere close to that level of success with ranges and containers, we should all be well pleased.
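The unix filter model Walter refers to can be shown in one line; each stage below knows nothing about the others beyond "lines in, lines out" (the sample data is made up for illustration):

```shell
# Each filter is an independent component; the pipe is the shared
# interface, much as ranges would be between containers and algorithms.
printf 'error\nok\nerror\nok\nerror\n' \
  | sort \
  | uniq -c \
  | sort -rn
```

Neither `sort` nor `uniq` was designed with the other in mind, yet they compose freely; that is the level of interoperability the range/container design is aiming at.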
May 21 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

One example of bad impedance matching is C++ iostreams' attempt to make a memory buffer look like a file. Phobos propagated that mistake in its own streams.
May 21 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com> said:

 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

	class Channel {
		private {
			Array!Message inbound;
			Array!Message outbound;
		}
		...
	}

What's the point of having extra indirection here?

I've been thinking about having both by-value and by-reference containers. The first-class name would be given to the by-reference container to give it more visibility, but that by-reference container would be a simple wrapper for a by-value container "part" implemented in a struct:

	struct ArrayPart(T) { ... } // by-value array container.

	class Array(T) {
		ArrayPart!T part;
		alias part this;
	}

So now, if you want to reuse a container "part" to build some kind of more complex data structure, you can do it like this:

	class Channel {
		private {
			ArrayPart!Message inbound;
			ArrayPart!Message outbound;
		}
		...
	}

No silly extra indirection.

That said, all this gives me a feeling of an overcomplicated design. After all, the problem is that you want to pass the container by reference in function arguments, but it's __too easy to forget the ref__. Perhaps that's the problem that should be fixed. Couldn't we just make a struct that cannot be implicitly copied? Perhaps something like this:

	explicitdup struct Array {  }

	void testVal(Array array);
	void testRef(ref Array array);

	unittest {
		Array array;
		testVal(array);     // error, cannot copy array implicitly
		testVal(array.dup); // ok, array is copied
		testRef(array);     // ok, array is passed by reference
	}

If there's already a way to achieve this, I couldn't find it.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 22 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:
 and you have to worry about null.

Nonnull references of course solve this problem.
 Couldn't we just make a struct that cannot be implicitly copied? 

The @disable attribute was invented for this purpose too:

	struct Foo {
		int x;
		@disable this(this) {}
	}
	void main() {
		Foo f1;
		Foo f2 = f1;
	}

That prints:

	test.d(7): Error: struct test9.Foo is not copyable because it is annotated with @disable

Bye,
bearophile
May 22 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 08:16:27 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 Couldn't we just make a struct that cannot be implicitly copied?

The @disable attribute was invented for this purpose too:

	struct Foo {
		int x;
		@disable this(this) {}
	}
	void main() {
		Foo f1;
		Foo f2 = f1;
	}

That prints:

	test.d(7): Error: struct test9.Foo is not copyable because it is annotated with @disable

Indeed, thanks. I figured it out by myself while you were writing this. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 22 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 07:56:31 -0400, Michel Fortin <michel.fortin michelf.com> said:

 	 explicitdup struct Array {  }
 
 	void testVal(Array array);
 	void testRef(ref Array array);
 
 	unittest {
 		Array array;
 		testVal(array);     // error, cannot copy array implicitly
 		testVal(array.dup); // ok, array is copied
 		testRef(array);     // ok, array is passed by reference
 	}
 
 If there's already a way to achieve this, I couldn't find it.

Apparently it's already achievable this way:

	struct Array {
		@disable this(this);
		Array dup() { return Array(...); }
		...
	}

It also blocks simple assignments:

	Array array2 = array;     // error, array is not copyable
	Array array3 = array.dup; // ok

With this, I don't think we need containers to be reference types anymore. The naive error of copying containers everywhere without you knowing about it (as it occurs in C++) is simply no longer possible.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
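Michel's sketch elides the body; a minimal compilable version of the same pattern might look as follows (the `int[]` payload and names are made up for illustration, not from any actual container library):

```d
struct Array
{
    int[] data;

    @disable this(this); // postblit disabled: no implicit copies

    // Explicit, visibly O(n) copy.
    Array dup() { return Array(data.dup); }
}

void main()
{
    auto a = Array([1, 2, 3]);
    // Array b = a;   // compile error: Array is not copyable
    auto c = a.dup;   // ok: copy is explicit
    assert(c.data == a.data && c.data !is a.data); // same contents, distinct storage
}
```

Note that returning `Array` from `dup` still works despite the disabled postblit, because returning a temporary is a move, not a copy.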
May 22 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Michel Fortin wrote:
 On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com> 
 said:
 
 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

	class Channel {
		private {
			Array!Message inbound;
			Array!Message outbound;
		}
		...
	}

What's the point of having extra indirection here?

Good question. I think the answer is:

1. When do you ever want to copy a collection? I almost never do, because copying one is an inherently expensive operation.

2. When you copy a collection, do you copy the container or the elements in the container or both? One would have to carefully read the documentation to see which it is. That increases substantially the "cognitive load" of using them.

3. Going to lengths to make them value types, but then disabling copying them because you want people to use them as reference types, seems like a failure of design somewhere.

4. That silly extra level of indirection actually isn't there. Consider that even value locals are accessed via indirection: offset[ESP]. For a reference collection we have: offset[EBX]. No difference (yes, EBX has to be loaded, but if it is done more than once it gets cached in a register by the compiler).

5. Just making them all reference types means the documentation and use become a lot simpler.
May 22 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 16:01:37 -0400, Walter Bright <newshound1 digitalmars.com> said:

 Michel Fortin wrote:
 What's the point of having extra indirection here?

Good question. I think the answer is:

1. When do you ever want to copy a collection? I almost never do, because copying one is an inherently expensive operation.

Whenever you need to preserve the previous state of something before applying some transformation on it. But I agree that the copy should be explicit because it is O(n), hence my suggestion of disabling implicit copying for containers.

Since we're at it, a reference-type container sometimes makes it too easy to just create a new reference to the same container when what you really want is to make a copy. I happen to have a bug of this sort to fix in my Objective-C program right now, where a reference to a container leaked where it should have been a copy, causing unwanted mutations to the .
 2. When you copy a collection, do you copy the container or the 
 elements in the container or both? One would have to carefully read the 
 documentation to see which it is. That increases substantially the 
 "cognitive load" of using them.

I don't see the extra cognitive load. Assuming you disable implicit copying of the container, you'll have to use ".dup", which will work exactly as an array. The way items are copied is exactly the same as if the container were a reference type (you call a "dup" or equivalent function and things get copied).
 3. Going to lengths to make them value types, but then disabling 
 copying them because you want people to use them as reference types, 
 seems like a failure of design somewhere.

I agree in a way. But at the same time, forcing everyone to use a reference type when sometimes a value type would be more adequate also looks like a failure of design to me. To me, the best tradeoff seems to be a value type, because it's quite trivial to create a reference type from a value type when you need it; the reverse is awkward.
 4. That silly extra level of indirection actually isn't there. Consider 
 that even value locals are accessed via indirection: offset[ESP]. For a 
 reference collection we have: offset[EBX]. No difference (yes, EBX has 
 to be loaded, but if it is done more than once it gets cached in a 
 register by the compiler).

Have you ever worked with containers of containers? Surely yes, since D associative arrays are one of them. So assume we want to implement our associative arrays like this:

	class HashTable(Key, Value) {
		Array!(Tuple!(Hash!Key, TreeSet!(Tuple!(Key, Value)))) buckets;
	}

Do you find it reasonable that the TreeSet be a reference type? Reference-type containers would mean one indirection and one extra allocated block for each bucket. Then add that 'Value' could itself be a struct or class containing its own container, and you're stuck with a third unnecessary level of indirection and extra calls to the GC to allocate containers and/or check for null. Sounds quite wasteful to me. In addition, those extra allocations add more logic to our hash table and thus more chances for bugs.

Here I'm using a hash table as an example; the same problem applies to many other data structures, whether they are generic or specific to a particular problem. Containers should be efficient and easy to use when composed one inside another. That's one of the greatest strengths of C++ value-type containers in my opinion.
 5. Just making them all reference types means the documentation and use 
 become a lot simpler.

Simpler to explain, maybe. Simpler to use, I have my doubts. You're just moving the burden somewhere else. A reference-type container requires a "new Container()" somewhere, and some protection logic against null. In exchange, you don't need to write 'ref' in functions taking containers, and can easily copy references to the container everywhere (sometimes too easily). But the reference-type benefits aren't entirely lost with a value type, because it's trivial to turn a value type into a reference type.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 23 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Scope class members should solve this.  It's been thrown around forever --  
 essentially, you should be able to define that a class member's storage is  
 contained within the owner object.

I too have proposed this idea, but in my opinion you can't design a big part of a standard lib on the basis of a syntax/language optimization that doesn't exist yet.

Bye,
bearophile
May 24 2010
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

One example of bad impedance matching is C++ iostreams' attempt to make a memory buffer look like a file. Phobos propagated that mistake in its own streams. A better way is to use ranges, where the file is accessed via a range and a memory buffer is accessed via a range. Then anything that operates on data is written to the range, not a fake file interface.
May 21 2010
prev sibling parent Alex Makhotin <alex bitprox.com> writes:
Steven Schveighoffer wrote:
 
 I myself don't really use the interface aspect of the classes, it is mostly a
carryover from the Java/Tango inspirations.  But I can see one good reason to
keep them -- binary interoperability.  For example, it might be the case some
day when D has good support for dynamic libraries that a library exposes some
piece of itself as a Map or List interface.
 

Hi, Steven. You made a good point on interoperability. Strict, precise, readable interfaces, that's what I would like in a really good standard library for D. But is D mature enough for this kind of possibility, considering, at least, the known bugs involving interfaces? I don't even feel free to use interfaces in my code because of the undefined behavior they may cause. Shared libraries... Is this going to happen on Linux? When?

-- 
Alex Makhotin, the founder of BITPROX,
http://bitprox.com
May 21 2010
prev sibling next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Graham St Jack Wrote:

 While I haven't read dcollections yet, I definitely agree with you about 
 not liking container hierarchies, and about the importance of support for 
 ranges.
 
 I hope Steven can be convinced that this is a good way to go :-).

All dcollections classes support ranges. I am already convinced of that :) -Steve
May 19 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-19 19:01:51 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 I wrote a solution to the problem in native D. It goes like this:
 
 alias Container!(int, addable | purgeable) Messerschmidt;
 
 void messWith(Messerschmidt i) {
      ... use i's capabilities to add and purge ...
 }

Are you sure this is necessary? I'm wondering how the above is different from:

	void messWith(C)(C i)
		if (IsAddable!C && IsPurgeable!C)
	{
		... use i's capabilities to add and purge ...
	}

where IsAddable just checks for an 'add' function and IsPurgeable checks for a 'purge' function. Obviously, one is a template and the other isn't. I'd expect the template function to be more performant since it doesn't require an indirection to call into the container. Is the need for runtime-swappable containers really common enough to justify adding it to the standard library? Won't adding this encourage people to use it without realizing the downside in performance and failed optimization opportunities because of the hidden dynamic dispatch? It's quite a nice idea, but I don't like the tradeoff.

This criticism is valid for containers implementing interfaces too. In my first Java programs, I was always declaring variables as the List interface, then instantiating an ArrayList for them, thinking it'd make things more generic and easier to change later. Generic sometimes is good, but if you do that with containers in D you're in for an important performance drop. Personally, I'd scrap anything that's not made of static calls (final functions in a class are fine) so people can't easily make these kinds of mistakes (and then believe D is slow).

Also, addable and purgeable above being or'ed constants makes the system difficult to scale to new concepts. The template predicates on the other hand are infinitely extendable: if my containers have 'commit' and 'rollback' functions, I can define IsTransactional to check for the presence of the functions and write some algorithms that benefit from this. In fact, this can apply to anything, not just containers. Ranges are already using this pattern. Wouldn't it make things easier to learn if we could just reuse the same principle with containers?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 19 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 09:48 PM, Michel Fortin wrote:
 On 2010-05-19 19:01:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 I wrote a solution to the problem in native D. It goes like this:

 alias Container!(int, addable | purgeable) Messerschmidt;

 void messWith(Messerschmidt i) {
 ... use i's capabilities to add and purge ...
 }

Are you sure this is necessary? I'm wondering how the above is different from:

	void messWith(C)(C i)
		if (IsAddable!C && IsPurgeable!C)
	{
		... use i's capabilities to add and purge ...
	}

It's different in that messWith is a type-parameterized function (aka a template in C++) with the known tradeoffs: multiple instantiations, risk of code bloating, but good speed most of the time. Add to that that some people aren't comfortable with those.
 where IsAddable just checks for an 'add' function and IsPurgeable checks
 for a 'purge' function. Obviously, one is a template and the other
 isn't. I'd expect the template function to be more performant since it
 doesn't require an indirection to call into the container. Is the need
 for runtime-swappable containers really common enough to justify adding
 it to the standard library? Won't adding this encourage people to use it
 without realizing the downside in performance and failed optimization
 opportunities because of the hidden dynamic dispatch? It's a quite nice
 idea, but I don't like the tradeoff.

These arguments are in line with mine. I tried above to convey that the situation is not all-win.
 This criticism is valid for containers implementing interfaces too. In
 my first Java programs, I was always declaring variables as the List
 interface, then instantiating an ArrayList for them, thinking it'd make
 things more generic and easier to change later. Generic sometime is
 good, but if you do that with containers in D you're in for an important
 performance drop. Personally, I'd scrap anything that's not made of
 static calls (final functions in a class are fine) so people can't
 easily make these kind of mistakes (and then believe D is slow).

By the way, I dislike the name ArrayList. Is it just me, or "list" is most often associated with "linked list" in computer lingo? So when I see "ArrayList" it looks like an oxymoron. Steve, could I impose on you to rename ArrayList simply Array?
 Also, addable and purgeable above being or'ed constants makes the system
 difficult to scale to new concepts. The template predicates on the other
 hand are infinitely extendable: if my containers have 'commit' and
 'rollback' functions, I can define IsTransactional to check for the
 presence of the functions and make some algorithms that benefits from
 this. In fact, this can apply to anything, not just containers. Ranges
 are already using this pattern. Wouldn't it make things easier to learn
 if we could just reuse the same principle with containers?

Nice arguments! Andrei
May 21 2010
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:57 PM, Bill Baxter wrote:
 On Wed, May 19, 2010 at 4:01 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a few
 concept checks (a la std.range checks) can easily express what capabilities
 a given client function needs from a container.

 Destroy me :o).

So instead of STL's concept hierarchy, you have essentially concept tags. Very Web 2.0. :-)

I agree that there doesn't seem to be any coding benefit to STL's concepts being hierarchical. If you need a push_back(), you've got to check for push_back(). The main benefit seems to be for documentation purposes, allowing you to say things like "bidirectional_iterator has this and that, plus everything in forward_iterator". But that could easily be rephrased as "it has backward_iteration plus forward_iteration" with two pages describing those two tags.

So I like the sound of it. But it seems actually a pretty small departure from the STL approach, in practice.

Well in fact STL has a concept hierarchy for iterators (which D also has for ranges), and a flat, unstructured approach to containers. I don't mind keeping what STL does if there's no good reason not to. One change I do think is beneficial is making containers reference types by default.

Andrei
May 21 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 I wrote a solution to the problem in native D. It goes like this:
 
 alias Container!(int, addable | purgeable) Messerschmidt;
 
 void messWith(Messerschmidt i) {
     ... use i's capabilities to add and purge ...
 }

I agree with Michel Fortin that the | is questionable. I'd like to suggest instead that it should be a variadic list of names, like:

	alias Container!(int, addable, purgeable) Messerschmidt;

Perhaps the names should follow a naming convention:

	alias Container!(int, ContainerAddable, ContainerPurgeable) Messerschmidt;

The problem with using scoped names, like Container.Addable, is that scoped names cannot be added to.
May 21 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/21/2010 01:34 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I wrote a solution to the problem in native D. It goes like this:

 alias Container!(int, addable | purgeable) Messerschmidt;

 void messWith(Messerschmidt i) {
 ... use i's capabilities to add and purge ...
 }

I agree with Michel Fortin that the | is questionable. I'd like to suggest instead that it be a variadic list of names, like:

    alias Container!(int, addable, purgeable) Messerschmidt;

Perhaps the names should follow a naming convention:

    alias Container!(int, ContainerAddable, ContainerPurgeable) Messerschmidt;

The problem with using scoped names, like Container.Addable, is that scoped names cannot be added to.

Well the problem stays: compound interfaces grow combinatorially with the number of components, because an interface X!(A, B, C) inherits at the same time X!(A, B), X!(A, C), and X!(B, C). Check the epic inheritance diagram here: http://www.artima.com/cppsource/codefeatures2.html and the equally epic table here: http://www.artima.com/cppsource/codefeatures3.html Andrei
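The combinatorial growth can be spelled out in a small D sketch (all names here are illustrative only):

```d
// For X!(A,B,C) to convert to X!(A,B), X!(A,C), and X!(B,C), every
// non-empty subset of capabilities needs its own interface --
// 2^n - 1 interfaces for n capabilities.
interface Addable(T)  { void add(T item); }
interface Purgeable   { void purge(); }
interface Iterable(T) { int opApply(scope int delegate(ref T) dg); }

// Pairwise and triple combinations, written out by hand:
interface AddablePurgeable(T)  : Addable!T, Purgeable {}
interface AddableIterable(T)   : Addable!T, Iterable!T {}
interface PurgeableIterable(T) : Purgeable, Iterable!T {}
interface AddablePurgeableIterable(T)
    : AddablePurgeable!T, AddableIterable!T, PurgeableIterable!T {}

void main()
{
    // n = 3 capabilities already require 2^3 - 1 = 7 interfaces, and the
    // count doubles with each capability added.
    assert((1 << 3) - 1 == 7);
}
```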
May 21 2010
prev sibling next sibling parent Graham St Jack <Graham.StJack internode.on.net> writes:
While I haven't read dcollections yet, I definitely agree with you about 
not liking container hierarchies, and about the importance of support for 
ranges.

I hope Steven can be convinced that this is a good way to go :-).
May 19 2010
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 19, 2010 at 4:01 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a few
 concept checks (a la std.range checks) can easily express what capabilities
 a given client function needs from a container.

 Destroy me :o).

So instead of STL's concept hierarchy, you have essentially concept tags. Very Web 2.0. :-) I agree that there doesn't seem to be any coding benefit to STL's concepts being hierarchical. If you need a push_back(), you've got to check for push_back(). The main benefit seems to be for documentation purposes, allowing you to say things like "bidirectional_iterator has this and that, plus everything in forward_iterator". But that could easily be rephrased as "it has backward_iteration plus forward_iteration" with two pages describing those two tags. So I like the sound of it. But it seems actually a pretty small departure from the STL approach, in practice. --bb
May 19 2010
prev sibling next sibling parent reply Bernard Helyer <b.helyer gmail.com> writes:
Oooohhh goody goody goody!   n_n


I'm in the process of making a little toy project ATM. I shall
integrate dcollections 2.0a into it ASAP.
May 19 2010
parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody!  n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.
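A minimal reproduction of the warning: with -w, dmd requires the `override` keyword on methods that override a base-class method, here Object.opEquals. ArrayListLike below is a stand-in, not dcollections' actual ArrayList:

```d
class ArrayListLike
{
    int[] data;

    // Writing `bool opEquals(Object o)` without `override` triggers the
    // warning; adding the keyword fixes it.
    override bool opEquals(Object o)
    {
        auto other = cast(ArrayListLike) o;
        return other !is null && data == other.data;
    }
}

void main()
{
    auto a = new ArrayListLike;
    auto b = new ArrayListLike;
    a.data = [1, 2];
    b.data = [1, 2];
    assert(a == b);   // dispatches through opEquals
    b.data = [3];
    assert(a != b);
}
```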
May 19 2010
parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 18:11, Bernard Helyer wrote:
 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody! n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

And lines 772 and 780 complained about not being able to implicitly cast const(Thing) to Thing. Which is strange, because T was Thing. Inserting cast(Thing) seemed to 'fix' it. D=
May 19 2010
parent Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 18:16, Bernard Helyer wrote:
 On 20/05/10 18:11, Bernard Helyer wrote:
 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody! n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

And lines 772 and 780 complained about not being able to implicitly cast const(Thing) to Thing. Which is strange, because T was Thing. Inserting cast(Thing) seemed to 'fix' it. D=

Sorry about the blow by blow, but the cursor thing seems to work well in my situation. Me likey dcollections very much so far. Go Steve!
May 19 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

Yes, if you want to define "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyway), or you could just say "I need a List" and get those functions too. It also does not force entities other than dcollections objects to be classes; they could be structs and implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them -- binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is -- go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them.

Does that make sense?

-Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.
May 19 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

This sounds like a failure of design. Why aren't you using ranges to do this?
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in the  
 interfaces.  If a library returns an interface, the author is saying "I  
 don't want you using any functions outside this interface," so why is  
 that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?
 Forcing people to *not* use interfaces has its drawbacks too.   
 Dcollections gives the most flexible design I could muster, while still  
 being useful.

 I'm not saying I'm against removing the interfaces until some later  
 date, but I don't see any convincing arguments yet, especially since  
 I've already seen benefits from having them.

 -Steve

May 20 2010
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 interfaces

Does that imply that the most important methods are virtual? If so, say good-bye to inlining, and hello to an additional level of dereferencing.

Without meaning any disrespect to all the work you did, allow me to say that I won't use a library that could be faster (without making usage too clumsy), but isn't. (I prefer my D programs to be as fast as reasonably possible -- if I didn't care about speed, I'd use another language.) For the same reasons, I'd be disappointed if the library were admitted as-is into Phobos, since it doesn't live up to my personal ideal of what D should be.

--
Best regards,
 Vladimir                            mailto:vladimir thecybershadow.net
May 21 2010
parent BCS <none anon.com> writes:
Hello Vladimir,

 On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 
 interfaces
 

If so, say good-bye to inlining, and hello to an additional level of dereferencing.

From a technical standpoint there is no reason that a method needs to be called virtually from a class reference just because the same method gets called virtually from an interface reference. -- ... <IXOYE><
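The point can be shown in a small D sketch (names are illustrative): a final class can implement an interface, and calls through the class reference need not be virtual -- only calls through the interface reference go through the vtable.

```d
interface Lengthy
{
    size_t length();
}

final class Arrayish : Lengthy
{
    int[] data;
    size_t length() { return data.length; }
}

void main()
{
    auto a = new Arrayish;
    a.data = [1, 2, 3];
    assert(a.length == 3);  // static type is final: direct call, inlinable
    Lengthy l = a;
    assert(l.length == 3);  // interface reference: virtual dispatch
}
```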
May 21 2010
prev sibling next sibling parent BLS <windevguy hotmail.de> writes:
Fantastic work Steve,
Pretty good design IMHO, not sure why .. but somehow the collection
battle reminds me of

Steve Vai vs Ry Cooder
<vbg>
Bjoern
May 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 22 May 2010 16:58:12 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.

One thing before I forget: I think any good collection abstraction must be concretized back to the classic collection instances. Singly-linked lists definitely can't be left off the list! It would be an epic failure. Imagine the papers! New York Times: "D has containers, but no singly-linked lists". The New Yorker: "D's abstractions are too abstract". The National Enquirer: "The Bat Boy stole D's singly-linked lists". Pyongyang Times: "Another failure of the so-called Western Democracy -- yet Juche doesn't need singly-linked lists anyway."

Keeping the length cached in a singly-linked list is a costly mistake. It works against splicing (an important list primitive), and most of the time you don't need it, so why waste time updating it? Adding any cruft beyond { T payload; List* next; } is a very strong incentive for the coder to write their own. Why do you think they're using an SLL in the first place? Because it's simple and has efficient primitives!

length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't possible, but all dcollections define length. In that case, you can do coll.begin.empty to determine if the collection has any elements.

But all dcollections are bi-directional anyway; there is no singly linked list. That was a decision I made early on, because it allows more assumptions about dcollections' containers. I think length-less singly linked lists would be a good addition to phobos outside the collection hierarchy; they are almost as simple as built-in arrays.

And singly linked vs. doubly linked does not make any difference to whether O(1) length is possible. As you say, it's O(1) splicing or O(1) length, regardless of single or double links.

I disagree with your assessment that length is a less used operation than splicing. I think I have never used splicing with std::list. C++'s default is O(1) length, and I followed that design.
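The splice-versus-length trade-off can be made concrete with a runnable sketch of the { T payload; List* next; } structure under discussion (spliceAfter is an illustrative name): splicing a chain in is O(1) precisely because no length counter has to be updated.

```d
struct Node(T)
{
    T payload;
    Node!T* next;
}

// Splice the chain first..last in after pos -- O(1), no length bookkeeping.
// Caching a length would force this to walk the spliced run to count it.
void spliceAfter(T)(Node!T* pos, Node!T* first, Node!T* last)
{
    last.next = pos.next;
    pos.next = first;
}

void main()
{
    auto n3 = Node!int(3, null);
    auto n1 = Node!int(1, &n3);   // list: 1 -> 3
    auto n2 = Node!int(2, null);
    spliceAfter(&n1, &n2, &n2);   // now: 1 -> 2 -> 3
    assert(n1.next.payload == 2);
    assert(n1.next.next.payload == 3);
}
```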
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest idea what a bimap is :) I am not an expert in collections or data structures; I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can. That being said, if you have an implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

I am not familiar with tries, and dcollections has no class hierarchy. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows much more possibilities in traversal than ranges which cannot use stack recursion without heap activity.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in the  
 interfaces.  If a library returns an interface, the author is saying "I  
 don't want you using any functions outside this interface," so why is  
 that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 14:00:42 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/20/2010 05:34 AM, Steven Schveighoffer wrote:

 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data
 from all the other containers.  The problem with using generic code
 is that the compiler will needlessly duplicate functions that are
 identical.

There is a copy() function that copies any range to any other range. It might need a revisit, but I think the way to go about copying is generic.

This implies the space for the elements already exists in the target range. Copying data from a list to a set for instance can't just allocate space in the set and then use some generic copy function to copy the data in. If generic copying is to be used, it would be a copy function defined by the collection, not a standard one.
 Using interfaces is not as viral as you think.  My interfaces can be
 used in generic code, as long as the generic code uses functions in
 the interfaces.  If a library returns an interface, the author is
 saying "I don't want you using any functions outside this interface,"
 so why is that a bad thing?

 Forcing people to *not* use interfaces has its drawbacks too.
 Dcollections gives the most flexible design I could muster, while
 still being useful.

 I'm not saying I'm against removing the interfaces until some later
 date, but I don't see any convincing arguments yet, especially since
 I've already seen benefits from having them.

What are the benefits that you have noticed? I think you'd need to back up the copying argument with some data if you want to frame it as a benefit.

When I say I see benefits, I mean the benefits are not data-related but design-related -- more designs are possible with interfaces and generic programming combined than with generic programming alone. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 22 May 2010 07:56:31 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com>  
 said:

 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and  
 containers, we should all be well pleased.

matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

    class Channel {
        private {
            Array!Message inbound;
            Array!Message outbound;
        }
        ...
    }

What's the point of having the extra indirection here?

Scope class members should solve this. It's been thrown around forever -- essentially, you should be able to declare that a class member's storage is contained within the owner object. Value-type node-based containers just don't work well -- it's too easy to escape references, and too easy to accidentally duplicate the entire node set when passing to functions. It works fine for arrays because an array has a very simple structure -- data and length -- and the length and data are somewhat orthogonal. -Steve
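The accidental-duplication hazard can be shown with a small sketch (ValueList is a hypothetical example, not a dcollections type): give a value-type container a copying postblit, and every pass-by-value silently duplicates the whole payload.

```d
struct ValueList
{
    int[] data;                       // stand-in for a chain of nodes
    this(this) { data = data.dup; }   // deep copy on every copy
}

void mutate(ValueList l)              // by value: operates on a full copy
{
    l.data[0] = 99;
}

void main()
{
    ValueList v;
    v.data = [1, 2, 3];
    mutate(v);                        // the entire "node set" is duplicated here
    assert(v.data == [1, 2, 3]);      // caller never sees the change
}
```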
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many
 kinds of HashMaps that derive from each other.  But the interfaces
 are not detrimental to your ideas.  The only thing interfaces require
 is that the entities implementing them are classes and not structs.
 As long as you agree that classes are the right call, then interfaces
 can co-exist with your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting; I don't think that is a good design.
 Yes, if you want to define "this function needs something that is
 both addable and purgeable, I don't have an interface for that.  But
 a concept can certainly define that generically (which is what you
 want anyways), or you could just say "I need a List" and get those
 functions also.  It also does not force entities other than
 dcollections objects to be classes, they could be structs and
 implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it is
 mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with a half-understood semblance of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.

Java has some warts as you have rightfully pointed out in the past (i.e. O(n) lookup), but I have attempted to remove all those warts. Dcollections I would hope does not suffer from the problems that Java's collections suffer from.
 But I can see
 one good reason to keep them -- binary interoperability.  For
 example, it might be the case some day when D has good support with
 dynamic libraries that a library exposes some piece of itself as a
 Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.

It's done all the time in Java and .NET. For example, a GUI listbox widget exposes its elements as an array of elements which implements the List interface. You don't ever see the implementation or need it. Granted, Java and .NET have fewer problems than C++ and D with binary compatibility, since their function tables are dynamic, but the potential is there for D to make binary compatibility possible with interfaces.

In the past I built a C++ library that abstracted features of the OS. My goal was to make it possible to dynamically load a module that abstracted things like setting the IP address of a network interface. My modules used std::string instead of char * to look up services to get objects that implement the interface. Big mistake. In a later version of the standard C++ runtime, the private implementation of std::string changed, so the dynamically loaded libraries crashed horribly. No change in string's interface, just the private stuff changed, but because it's a template, the code that uses it necessarily has to be aware of it. We ended up ditching the standard C++ library's version of string and used STLport so we could control the library. I envision the same sort of problem would be likely with D collection objects that were not used via interfaces.
 So my answer is -- go ahead and define these concepts and required
 names, and you can ignore the interfaces if they don't interest you.
 They do not subtract from the possibilities, and others may find good
 use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion.

The reason I put them in is because they existed before. But thinking about what they would be useful for: D doesn't really support dynamic libraries all that well (recent advances have brought that closer). Once it does, I think it will be much more important to keep compiled libraries compatible between versions of the standard library. This is typically not possible with templates; I know C++ is horrible at it from personal experience.

So I'm not convinced it is not quite good at anything. I think it's a solution to a problem that doesn't exist yet in D because D's linking features are stuck in 1995.
 Why would we keep in the standard library bad design with the advice  
 that "if you don't like it ignore it"?

People have continuously used that argument against const and immutable. Really that argument is fine, as long as you don't consider it bad design, which I don't :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collections roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.

I agree, removing all the interfaces would get rid of the need for NO_LENGTH_SUPPORT. But all dcollections define a valid length, so that point is kinda moot :) However, supporting non-length containers via generic programming vs. interfaces would be much smoother.
 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list.  That was a decision I made early on, because it allows  
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that.

I felt like you did. Your statement was: "It works against splicing (an important list primitive) and most of the time you don't need it so why waste time updating it." While you didn't specifically say it was used more, you mentioned the importance of splicing, and the non-importance of length. I guess I should have said, I disagree that splicing is more important than caching length.
 My main point was that if one defines an slist, in many cases one  
 defines it to be very small, simple, and cheap. Maintaining size will  
 come on the radar and would discourage the use of the abstraction in  
 favor of a hand-rolled implementation. This is failure to abstract.

All dcollections own their elements; it is not like an array, where there can be multiple references to subsets of the data. An slist would simply be an slist as you describe, with a slightly bigger head. In other words, the head node has the length cache, allocator, object monitor, and whatever else you need for the whole list. The nodes themselves would be a simple element and pointer. But the elements are never exposed except via ranges and cursors. The ranges and cursors cannot affect the topology of the list; you need the head "owner" node to do that.

I look at it this way: dcollections are "good enough" for most uses; if you need highly tailored data structures with specific uses in mind, you can make those on your own. For example, dcollections' LinkList can easily take the place of a simple slist. It may not be as fast with your specific requirements in mind, but it works. The benefit is, it works like all the other collection types, and it's easy to learn them all together because they share certain properties.
 I don't know of a word for "hierarchy with interfaces as root and inner  
 nodes and classes as leaf nodes" so I just call that class hierarchy.

I view them as classes with interfaces tacked on because the implementations are similar :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon first read. I suggest you look at tries, and at the following other structures, as good examples of why it's futile to fit collections into hierarchies: http://en.wikipedia.org/wiki/Trie http://linux.thai.net/~thep/datrie/datrie.html http://en.wikipedia.org/wiki/Suffix_tree http://en.wikipedia.org/wiki/Kd-tree In time we'd want to implement those and many more in Phobos, without worrying that some of their primitives won't fit the existing interfaces, and without the unjustified effort of formalizing interfaces for each of them on the assumption that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

    class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys. -Steve
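To make the K[]-key idea concrete, a minimal trie over string keys might look like this (an illustration only, not a proposed dcollections type):

```d
struct TrieNode(V)
{
    TrieNode!V*[char] children;  // one edge per character
    V* value;                    // non-null only at stored keys
}

// Walk/extend the path for key, then store val at its end.
void insert(V)(ref TrieNode!V root, string key, V val)
{
    auto cur = &root;
    foreach (c; key)
    {
        if (c !in cur.children)
            cur.children[c] = new TrieNode!V;
        cur = cur.children[c];
    }
    cur.value = new V;
    *cur.value = val;
}

// Return a pointer to the stored value, or null if key is absent.
V* lookup(V)(ref TrieNode!V root, string key)
{
    auto cur = &root;
    foreach (c; key)
    {
        auto p = c in cur.children;
        if (p is null)
            return null;
        cur = *p;
    }
    return cur.value;
}

void main()
{
    TrieNode!int root;
    insert(root, "cat", 1);
    insert(root, "car", 2);
    assert(*lookup(root, "cat") == 1);
    assert(*lookup(root, "car") == 2);
    assert(lookup(root, "ca") is null);  // a prefix, not a stored key
}
```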
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 22:56:54 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 interfaces

Does that imply that the most important methods are virtual? If so, say good-bye to inlining, and hello to an additional level of dereferencing. Without meaning any disrespect to all the work you did, allow me to say that I won't use a library that could be faster (without making usage too clumsy), but isn't. (I prefer my D programs to be as fast as reasonably possible - if I didn't care about speed, I'd use another language.) For the same reasons, I'd be disappointed if the library was admitted as-is into Phobos, since it doesn't live up to my personal ideal of what D should be.

With the future update making all the classes final, their methods are not virtual as long as you don't use the interfaces. -Steve
May 24 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Mon, 24 May 2010 08:06:29 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:


 I understand these points, but I'm already using interfaces to copy  
 data between containers.  I don't have to, I could have used generic  
 code, but this way, only one function is instantiated to copy data  
 from all the other containers.  The problem with using generic code is  
 that the compiler will needlessly duplicate functions that are  
 identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows many more possibilities in traversal than ranges, which cannot use stack recursion without heap activity.

Ranges are not necessarily better and may have some minor amount of overhead over a well optimized opApply. Then again, the opposite may be true. The point is that if the difference between opApply and a range is more than trivial, you've probably had a failure of design occur. Honestly, I'm having trouble thinking of a container which allows stack recursion for traversal and doesn't have an efficient range/loop variant. For that matter, a range using heap activity is also a clear indication of a design failure.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in  
 the interfaces.  If a library returns an interface, the author is  
 saying "I don't want you using any functions outside this interface,"  
 so why is that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve

No, if a 3rd party library _needs_ to use interfaces it's probably for a good reason. The problem is, if they exist, people are going to use them even if they don't need them. And therein lies the problem.
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this: auto total = cont1.length + cont2.length; because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check. I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length, it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
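The awkwardness is easy to see in a sketch (Python; the sentinel value and the `checked_length` helper are hypothetical stand-ins for dcollections' `NO_LENGTH_SUPPORT`, which in D would be something like `size_t.max`):

```python
# Hypothetical sentinel standing in for dcollections' NO_LENGTH_SUPPORT
NO_LENGTH_SUPPORT = 2**64 - 1

def checked_length(n):
    # the runtime check every client of the interface would need
    if n == NO_LENGTH_SUPPORT:
        raise ValueError("container does not support O(1) length")
    return n

len1, len2 = 10, NO_LENGTH_SUPPORT

# Andrei's example: this "compiles, runs, and produces some odd result"
bogus_total = len1 + len2
assert bogus_total == 10 + NO_LENGTH_SUPPORT  # nonsense value, no error raised

# The guarded version surfaces the problem instead of hiding it
try:
    total = checked_length(len1) + checked_length(len2)
except ValueError:
    total = None
assert total is None
```

Every arithmetic use of `length` silently inherits the sentinel unless the caller remembers the guard, which is the door-left-ajar problem Andrei describes.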
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows  
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of  
 the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist  
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back(). -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:51:39 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:39 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies:

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement those and many more in Phobos in time, without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in the belief that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys.

The key point of tries is that they have distributed storage. Thus, shoehorning them into the interface function Iterator!(K[]) keys(); forces allocation and copying.

I wasn't thinking of that; I was thinking the key point of tries is more efficient lookup for strings of things. One solution is that Trie could be a separate interface (i.e. a sibling of Map). One thing to point out is that D's arrays are adept at appending and prefixing. A K[] key would not necessarily be reallocating for each node traversal, but it certainly would be copying data. How would you define a way to get all the keys of a Trie? Or would you simply leave that function off? Is it unreasonable to expect to be able to iterate the keys in a trie? (I don't really know, I've never worked with them) -Steve
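A toy trie makes the tradeoff concrete: a `keys` iterator must rebuild each key from the node path, which is exactly the copying Andrei objects to (Python sketch; all names here are illustrative, not dcollections' API):

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # key element -> child node
        self.value = None
        self.terminal = False   # does a key end at this node?

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
        node.terminal = True

    def keys(self):
        # Each yielded key is (re)built from the path of nodes: the trie
        # never stores whole keys, so this is where the copying happens.
        def walk(node, prefix):
            if node.terminal:
                yield "".join(prefix)
            for ch, child in sorted(node.children.items()):
                prefix.append(ch)
                yield from walk(child, prefix)
                prefix.pop()
        yield from walk(self.root, [])

t = Trie()
for k in ("ax", "ay", "b"):
    t.insert(k, len(k))
assert list(t.keys()) == ["ax", "ay", "b"]
```

The shared `prefix` buffer keeps the traversal cheap, but each key handed to the caller is still a fresh copy assembled from the nodes.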
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:45:51 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:01 AM, Steven Schveighoffer wrote:
 On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many kinds of HashMaps that derive from each
 other. But the interfaces are not detrimental to your ideas. The
 only thing interfaces require is that the entities implementing
 them are classes and not structs. As long as you agree that
 classes are the right call, then interfaces can co-exist with
 your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting, I don't think that is a good design.

Why? I think refcounting for containers is perfect. Refcounting small objects is bad because the counter overhead is large. Also, small objects tend to be created and passed around a lot, so the time overhead is significant. In contrast, refcounting and containers are a perfect marriage. The container is a relatively large conglomerate of objects so refcounting that is bound to yield good benefits in terms of memory reclamation.

I meant refcounting elements. If you are simply refcounting the container, what do you do when someone wants to remove a node? Not allow it if more than one reference to the container exists?
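Python containers are already reference-counted handles, so the scenario Steve asks about can be sketched directly (assignment copies only the handle, never the elements):

```python
# A list variable is a refcounted handle to a shared container --
# the model Andrei proposes for D container structs.
a = [1, 2, 3]
b = a                 # copies the handle (one refcount bump), not the elements
b.remove(2)           # remove a "node" through one reference...
assert a == [1, 3]    # ...and every other reference sees the removal
assert a is b         # both names refer to the same container
```

Refcounting the container controls when the storage is freed; it does nothing to prevent removal through an alias, which is the distinction Steve is probing.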
 Java has some warts as you have rightfully pointed out in the past
 (i.e. O(n) lookup), but I have attempted to remove all those warts.
 Dcollections I would hope does not suffer from the problems that
 Java's collections suffer from.

That's great. But let me quote what you said: "I myself don't really use the interface aspect of the classes, it is mostly a carryover from the Java/Tango inspirations." I took that to mean you don't have a strong justification for structuring dcollections as a hierarchy, and furthermore that makes me hope it's possible you'd be willing to yank that dinosaur brain.

I did not have a strong justification for using interfaces originally, and my first interface design shows that. But I think with careful design, the interfaces in the D2 version are pretty useful in some situations, all of which could be essential to a project's design. In other words, I don't see the fact that Java or Tango's original collection package had interfaces as a "wart". The problem I see right now is, a lot of that utility is moot since D is not very good at dynamic linking. Since you necessarily have to statically link Phobos and dcollections, it makes no sense to keep binary compatibility, or hide implementation.
 Looks like we're heading straight to stalemate once again.

 In the interest of time, it would be better to get to stalemate (or,  
 hopefully, agreement) so we know whether dcollections will be integrated  
 within Phobos or whether I should spend this and next weeks' evenings  
 coding. To that end, please let me know whether it's worth that I spend  
 time on putting together a list of proposed changes.

I think if we can keep dcollections as classes, then the opportunity to have interfaces still exists for the future. If that's the case, we can ditch the interfaces for now, and revisit it when D's dynamic lib support gets better. So let's move on to other issues. I haven't changed my mind on interface utility, but there seems to be not much support for the idea. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 12:06:18 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Mon, 24 May 2010 08:06:29 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:


 I understand these points, but I'm already using interfaces to copy  
 data between containers.  I don't have to, I could have used generic  
 code, but this way, only one function is instantiated to copy data  
 from all the other containers.  The problem with using generic code  
 is that the compiler will needlessly duplicate functions that are  
 identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows many more possibilities in traversal than ranges, which cannot use stack recursion without heap activity.

Ranges are not necessarily better and may have some minor amount of overhead over a well optimized opApply. Then again, the opposite may be true. The point is that if the difference between opApply and a range is more than trivial, you've probably had a failure of design occur.

The difference is trivial. I support both ranges and opApply. The main benefit of using opApply is that only one function is compiled by the compiler.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in  
 the interfaces.  If a library returns an interface, the author is  
 saying "I don't want you using any functions outside this interface,"  
 so why is that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve

No, if a 3rd party library _needs_ to use interfaces it's probably for a good reason. The problem is, if they exist, people are going to use them even if they don't need them. And therein lies the problem.

And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_ to use interfaces, and your code depends on X and Y, and interfaces aren't defined by dcollections, where are you then? I don't think there's any way you can define a generic standard library such that people won't create bad designs, that's a lost cause. I think part of the problem with all this is that interfaces aren't likely to be needed any time soon (D's dynamic lib support is almost non-existent), so I'm probably going to drop the interfaces for now in the interest of progress. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:04:31 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Scope class members should solve this.  It's been thrown around forever  
 --
 essentially, you should be able to define that a class member's storage  
 is
 contained within the owner object.

I too have proposed this idea, but in my opinion you can't design a big part of a standard lib on the basis of a syntax/language optimization that doesn't exist yet.

Yeah, but essentially, it's an optimization. Whether the storage is stored in the same heap block or a different one really doesn't matter in terms of functionality. Allocating one heap block vs. two is faster, that's what the OP was saying. In other words, scope class members aren't essential to using dcollections. -Steve
May 24 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:

 And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_ to  
 use interfaces, and your code depends on X and Y, and interfaces aren't  
 defined by dcollections, where are you then?

You're at the point where the language allows you to create a class following an interface, all of whose methods redirect to those of the struct. This could even be done automagically. -- Simen
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:11:27 -0400, Simen kjaeraas  
<simen.kjaras gmail.com> wrote:

 Steven Schveighoffer <schveiguy yahoo.com> wrote:

 And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_  
 to use interfaces, and your code depends on X and Y, and interfaces  
 aren't defined by dcollections, where are you then?

You're at the point where the language allows you to create a class following an interface, all of whose methods redirect to those of the struct. This could even be done automagically.

A situation Robert has stated he would like to avoid. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie?  (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind).

That could be supported. But iterating over the keys, is that not advisable? Would your trie iterator be the only way to iterate over the elements? It seems rather non-useful. Or is there another way of iterating the keys that would be more efficient than building an array to contain the 'key' for each node? -Steve
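The exploratory iteration Andrei describes can be sketched as a cursor over a toy trie (Python; `descend` plays the role of his `popFront(char)`, and `count` is one possible piece of `front()` information; all names here are invented):

```python
def build_trie(words):
    # plain nested dicts; a terminal key is marked with the entry None
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[None] = True
    return root

def _terminals(node):
    # yield once per key ending at or below this node
    for k, v in node.items():
        if k is None:
            yield True
        else:
            yield from _terminals(v)

class Explorer:
    # a range-like cursor: each descend(ch) narrows to keys sharing the prefix
    def __init__(self, node):
        self.node = node

    def descend(self, ch):              # Andrei's popFront(char)
        child = self.node.get(ch) if self.node is not None else None
        return Explorer(child)

    @property
    def empty(self):                    # the range went "dry": no such prefix
        return self.node is None

    def count(self):                    # front(): info about keys under prefix
        if self.node is None:
            return 0
        return sum(1 for _ in _terminals(self.node))

t = build_trie(["ax", "axe", "ay", "b"])
r = Explorer(t).descend("a")
assert not r.empty and r.count() == 3   # ax, axe, ay
r = r.descend("x")
assert r.count() == 2                   # ax, axe
assert r.descend("q").empty             # nothing starts with "axq"
```

Note that nothing in this exploration ever materializes a whole key, which is why it sidesteps the copying that a `keys` iterator needs.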
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 20 May 2010 02:11:58 -0400, Bernard Helyer <b.helyer gmail.com>  
wrote:

 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody!  n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

http://www.dsource.org/projects/dcollections/ticket/5 Any other problems, please create a ticket (including your Thing one, but I'm not sure what you were doing there :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 11:14 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this: auto total = cont1.length + cont2.length; because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check. I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length, it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows
 more
 assumptions about dcollections' containers. I think length-less  
 singly
 linked lists would be a good addition to phobos that are not part
 of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :) The overhead of storage is minimal compared to the elements of the list, which do not contain said overhead.
 I think we need to build the shared vision that in Phobos we're  
 competing against hand-written, highly-optimized code. This is the fight  
 STL took and largely won. We can't afford to lower our standards for one  
 tiny bit.

 I was once talking over a beer about the advantages of container  
 abstractions. Walter slapped my face by defining the fastest SLL in the  
 world on a napkin. It looked like this:

 struct List { int v; List * next; }

 He took so little time to define that, and the primitives he needed were  
 so simple and fast, that it was an absolute overkill to replace that  
 with a generic whiz-bang container. If I'd mentioned there'd be any  
 extra data involved, that would mean the end of negotiation. "Why do you  
 need that?" "For length()!" "But I don't need length()!" "Well you have  
 to." "That's Communism in design!"

 And I agree.

But something like the above is prone to all sorts of nasty things, like inadvertent cycles in your list. That may be acceptable to you if you want the bleeding fastest speed available. There will always be tailored code crafted to be as fast as possible for some specific design, and dcollections and your slist just won't fit the bill. And dcollections' link list node is exactly what you wrote there (with a prev pointer of course). The only difference is, I don't define an element to be the same as a list. The whole list has its own data, including a pointer to the first element in the list. Look at D's arrays. Is appending with D's arrays the fastest it possibly could be? Hell no, but it's good enough for most situations, and safe. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:32:01 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 And the appending is hard to use too, see the ridiculously complex to  
 use & messy capacity, reserve, and assumeSafeAppend.  So overall it's  
 a good example of what I will never want to copy.

That part is still pretty new. What do you find is messy and complex about it? When I said "good enough for most situations" I didn't mean "most situations where appending speed is crucial to the success of the application" :P Most situations means you need to do appending once in a while. A perfect example is something like toStringz. And most situations do not require the three above mentioned functions. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:29:23 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it  
 possibly
 could be?  Hell no, but it's good enough for most situations, and safe.

Appending to D dynamic arrays is awful: it's slow and requires complex code.

complex code? a ~= b; Seems pretty not complex. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:58:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never  
 worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).

No there isn't. I wouldn't make that an interface, as it's not foreachable. I'd build a special range for it.
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.

Including keys?!!! Still not a straight answer. If you should be able to iterate keys, then say you should be able to iterate keys. If you shouldn't, then say that.
 Or is there another way of iterating the keys that would be more  
 efficient than building an array to contain the 'key' for each node?

I don't know. I'm just glad I can explore that without having to think in interfaces.

I don't really know what this means. The interfaces are a way to plug containers in at runtime. They are not meant to be the entire API for the collection. You can absolutely add whatever functionality that does not fit in the interface as an additional feature, and users will have to use the exact type in order to use that feature. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:35:03 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Andrei Alexandrescu:
 When was the last time you measured? I thought the speed has largely
 improved since Steve integrated his work.

I have timed it after that integration. I have seen a performance improvement, but it's small. I can perform some syntactic benchmarks.

Performance was vastly improved for situations where one was appending to multiple arrays at once. For appending to a single array in a loop, it is improved, but not that much. But the main improvement was for safety. The append operation on D's dynamic arrays is a tradeoff between footprint, speed, and safety. It also does not and should not compromise performance on operations besides appending. You will always be able to outperform the D append operation with more focus on the append operation, but D's array appending is fine for most situations. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:07:22 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:49 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 A pointer to the end element would be required for both appending and
 back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :)

They are. Ask Walter for a good example.

Actually, I realize that pointing to the end node isn't good enough, because once you remove that node, there is no way to get to the previous node without traversing the list again. So slist cannot implement the List interface, you were right.
 The overhead of storage is minimal
 compared to the elements of the list, which do not contain said  
 overhead.

I'm talking containers of lists here.

I was more talking about the ratio of elements to lists. For example, the Hash implementation uses the same link list nodes as elements in its table, and generally a hash has very few elements per list, so overhead on the list head would be too much. On top of that, a LinkList has its own allocator, meaning each one consumes a large chunk of data assuming you will be adding lots of elements to it. So maybe we need to refine the implementation building blocks that back the dcollections classes so they are useable in special situations where performance is critical. It actually would be pretty trivial to define SLink as a specialization of dcollections' Link struct, and that would essentially give you a lot of functionality in terms of your ideal list type.
 Which may be acceptable to you if you
 want the bleeding fastest speed available. There will always be tailored
 code crafted to be as fast as possible for some specific design and
 dcollections and your slist just won't fit the bill.

Except that the set is considerably smaller for a well-designed slist.

The set of problems that an slist solves is also considerably smaller. Having that backwards capability enables more algorithms. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap? Forget about interfaces; pretend you were writing it separately from dcollections. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy trie traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?
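The parent-pointer variant can be sketched like this (hypothetical node layout; a real trie would also hold children and an optional value):

```d
// Sketch: recovering a key from a trie node when each node keeps a
// parent pointer, as suggested above. Hypothetical, simplified layout.
struct TrieNode
{
    char label;       // edge character leading into this node
    TrieNode* parent; // null at the root
}

// Walk leaf-to-root, prepending, so no per-key storage is needed
// until a caller actually asks for the key.
string keyOf(TrieNode* leaf)
{
    char[] buf;
    for (auto n = leaf; n !is null && n.parent !is null; n = n.parent)
        buf = n.label ~ buf; // prepend: builds root-to-leaf order
    return buf.idup;
}
```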

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 17:47:18 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:38 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy trie traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know.

This wins a little battle but not the war. Long term you're facing the Sisyphean job of either knocking new containers into the existing interfaces, or extending the interfaces to accommodate new containers. It doesn't look to me like a winning proposition.

It's always easy to say that there may come a day when the interfaces don't work -- none of us can see the future. When that day comes, we can find a solution to it. The worst case scenario is that you simply don't implement any interfaces; that does not detract from the interfaces that exist. It would seem odd for some of the collections to implement the interfaces while others do not. At the very least, though, all containers should be iterable, meaning they will implement the Iterator!V interface. That at least allows them to interact with the other container types. On the flip side, if containers did not implement interfaces, having to do this:

class WrappedSet(alias Impl, V) : Set!V
{
    private Impl!V impl;
    int functionToSatisfySet() { return impl.functionToSatisfySet(); }
    ...
}

seems to me a lot more crufty and bloated than simply adding : Set!V to the end of the class declarations. But I'm willing to drop the interfaces for now, since interfaces are obviously an unpopular choice. -Steve
May 24 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:
 On the flip side, if containers did not implement interfaces, having to  
 do this:

 class WrappedSet(alias Impl, V) : Set!V
 {
     private Impl!V impl;
     int functionToSatisfySet() { return impl.functionToSatisfySet(); }
     ...
 }

 seems to me like a lot more crufty and bloated than simply adding :  
 Set!V on the end of the class declarations.

This would not be necessary. We can get the function names off of the interface, so we could have a template do this for us. -- Simen
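The idea Simen describes might look roughly like this. This is a deliberately simplified sketch: it assumes every interface method is nullary and returns int, whereas a real version would also reflect parameter lists and return types:

```d
// Sketch of generating forwarding methods from an interface's member
// names with a string mixin. Deliberately simplified: assumes every
// method takes no arguments and returns int.
interface Set(V)
{
    int count();
}

string forwardAll(I)()
{
    string code;
    foreach (name; __traits(allMembers, I))
        code ~= "int " ~ name ~ "() { return impl." ~ name ~ "(); }\n";
    return code;
}

class WrappedSet(alias Impl, V) : Set!V
{
    private Impl!V impl;
    mixin(forwardAll!(Set!V)()); // forwarding bodies generated here
}
```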
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib. And I have specifically decided not to use interfaces with ranges, because that makes them reference types. Ranges work well as value types, but not well as reference types. Therefore, to use dcollections through interfaces, you must not require the range traits. -Steve
May 27 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 19 May 2010 17:18:04 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 Andrej Mitrovic Wrote:


 Well as long as you're here can I submit an error here? :)

 I get an error while building the D2 branch:

 Error: ArrayMultiset.obj : No such file or directory

Crud, I admit that I assumed anything that built on Linux would build on Windows. I still believe it will, but of course, I need to change the batch file :)
 It seems the ArrayMultiset.d is not in the dcollections folder for the  
 D2.x  branch, although it is in trunk (I guess the trunk is the D1.x  
 one?).

Yes, D1 is trunk, D2 is going to eventually be trunk
 Is that module deprecated for d2.x? (although it's listed in the win32  
 batch file)

Yes, just remove it. I'll fix it when I get a chance.

I've fixed the windows build files and recreated the zip/tarball files for 2.0a. Learned some batch-fu online and hopefully the new versions will be future-proof. Please re-download the zipfile to build with windows. Thanks -Steve
May 27 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 28 May 2010 06:24:26 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 27/05/2010 11:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

Ah, never mind, my mind slipped and I was thinking of any kind of library, static ones as well. Although even just dynamic-library compatibility seems a valid enough case that we should consider it from the start, even if it's not well supported currently.

I agree, that's why dcollections has interfaces. I'm keeping them there since std.container has gone its own direction. -Steve
May 28 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 28 May 2010 06:10:49 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

I got my patch for building Tango as a dynamic library on Mac included in trunk quite recently. I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch that makes it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to keep thinking that dynamic libraries aren't supported; we have to think forward and assume they will be.

I remember that, and I'm very encouraged by it. That being said, the ultimate goal is to have dmd able to build dynamic libraries easily. D has had dynamic library "support" for years, but you have to do all kinds of manual stuff, and the standard library isn't dynamic. Until the standard library is dynamic, and I can build a dynamic library with a -shared equivalent switch, dynamic libs are a laboratory feature, and not many projects will use them. Just so you know, I think it's important to support binary compatibility in dcollections, and since std.container has not adopted dcollections, I'm going to keep interfaces. I was just pointing out the position others may hold WRT binary support. -Steve
May 28 2010