
digitalmars.D.announce - dcollections 1.0 and 2.0a beta released

"Steven Schveighoffer" <schveiguy yahoo.com> writes:
After much work and toil, I have created dcollections for D2.  I think I  
can say that this is the first collection package available for D2.  I  
still hold hope that it will be used as the standard collection package  
for Phobos in D2.

Please visit http://www.dsource.org/projects/dcollections for details.

Some VERY IMPORTANT notes for this release:

1. It is very beta software.  Although the implementation has changed very  
little from the 1.0 version, not everything will compile properly.  I have  
filed several compiler bugs for problems I hit while porting, and  
inserted workarounds for things I could work around.  Please use the  
LATEST compiler (currently 2.046), because some very important features  
have been added that dcollections relies on.  Although it is beta, the  
algorithms and implementation are largely unchanged, so functionality should  
be pretty solid.

2. The docs are not online for D2.  Unfortunately, dsource's  
auto-doc generator that I was using for the 1.0 version is based on a 1.0  
compiler, so it will not automatically generate the D2 docs.  However, I  
have included some basic ddoc-generated docs in the download package;  
don't expect them to be very fun to use :)

3. The docs are not fully updated, so they might be just flat out wrong!   
I plan to update the docs completely for the next beta release.

So, for the good stuff... here are some notes for differences between 1.0  
and 2.0:

     * Ranges!
     * Filled out slicing for all containers
     * Cursors are now non-movable.  Use ranges for safe iteration.
     * The interfaces have been cut down significantly. The question I  
asked myself when deciding whether I wanted to keep an interface is "would  
anyone use this interface?"
     * Functions that should be fast but can be slow (O(n)) have been  
removed from all interfaces, and in most cases, from the containers.  
Notably missing is searching a non-quick lookup container for a value. Use  
std.algorithm.find for that.
     * ArrayMultiset has been removed -- its complexity guarantees did not  
fit the multiset requirements (specifically, quick lookup of an  
element's presence).
     * ArrayList slicing now simply returns a range instead of a "live"  
slice.  Note that an ArrayList range is aliased to a D array.
     * Full unit tests added for all container types.

I have also changed the license from BSD to Boost.


--------

I have also formally released version 0.03 as version 1.0, nothing has  
changed there, including the license.  However, no new features will be  
added to dcollections version 1.0, only bug fixes will be done.

Please, file tickets on dsource for any problems you have.  And let me  
know if there are any ideas you think would be useful.

-Steve
May 19 2010
"Yao G." <nospamyao gmail.com> writes:
Thanks Steven. I was waiting for this for a long time.



-- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
May 19 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Are the collections supposed to not have isEmpty members?

Looking forward to using dcollections, and here's to hoping they make it 
into phobos.

OTish: What are your thoughts on a bimap implementation and a 
child/sibling or general tree implementation as part of dcollections?
May 19 2010
Steven Schveighoffer <schveiguy yahoo.com> writes:
Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.
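The "length == 0" answer works because every dcollections container keeps its count up to date on each topology change. A toy sketch of that bookkeeping in Python (illustrative only; dcollections is D code and its real names differ):

```python
# Sketch (not dcollections code): a list that maintains its length on
# every insert/remove, so "length == 0" is an O(1) emptiness test and
# no separate isEmpty member is needed.
class CountedList:
    def __init__(self):
        self.head = None
        self.length = 0          # updated on every topology change

    def add(self, value):
        self.head = {"value": value, "next": self.head}
        self.length += 1
        return self              # allow chaining, like add(1).add(2)

    def remove_front(self):
        assert self.head is not None
        node = self.head
        self.head = node["next"]
        self.length -= 1
        return node["value"]

lst = CountedList()
assert lst.length == 0           # empty check in O(1)
lst.add(1).add(2)
assert lst.length == 2
```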
 OTish: What are your thoughts on a bimap implementation and a 
 child/sibling or general tree implementation as part of dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

-Steve
May 19 2010
Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

I think boost.bimap is where I saw it, though I don't use C++. I think it's a map where the values->keys direction is also a map.
 That being said, if you have any implementation of a tree or hash, it should
be easy to insert into dcollections.  If you have ideas for other collection
types (i.e. other than Map, Set, Multiset or List), then I can look into that
if you point me at an implementation or have one of your own.  I purposefully
left out multi-map because I've never had a huge use for it, and it seemed like
a awkward thing to create an interface for...

 -Steve

I have a simple child/sibling tree implementation which I could probably dust off and polish up if you want it. The method for visiting the elements is kind of weird, though. And I don't know that it exactly fits the mold of a reference container. Maybe with cursors.

Ugh, I just noticed LinkList doesn't work with interfaces.
May 19 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.

One thing before I forget: I think any good collection abstraction must be concretized back to the classic collection instances. Singly-linked lists definitely can't be left off the list! It would be an epic failure. Imagine the papers! New York Times: "D has containers, but no singly-linked lists". The New Yorker: "D's abstractions are too abstract". The National Enquirer: "The Bat Boy stole D's singly-linked lists". Pyongyang Times: "Another failure of the so-called Western Democracy -- yet Juche doesn't need singly-linked lists anyway."

Keeping the length cached in a singly-linked list is a costly mistake. It works against splicing (an important list primitive), and most of the time you don't need it, so why waste time updating it? Adding any cruft beyond { T payload; List* next; } is very strong incentive for the coder to write their own. Why do you think they're using an SLL in the first place? Because it's simple and has efficient primitives!
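The splicing trade-off can be made concrete. A minimal sketch in Python (illustrative only, not from any library): moving a node between lists is pure pointer surgery, O(1), but any cached per-list length would need updating on every such move, and for a longer run that means counting the moved nodes, O(n):

```python
class Node:
    def __init__(self, v, nxt=None):
        self.v = v
        self.next = nxt

def splice_after(src, dst):
    """Move the single node after `src` to just after `dst`: O(1)
    pointer surgery.  If each list cached its length, both caches would
    have to be adjusted here, and splicing a longer run would mean
    walking it to count the moved nodes -- O(n)."""
    moved = src.next
    src.next = moved.next
    moved.next = dst.next
    dst.next = moved

# two tiny lists: a -> b   and   x -> y
b = Node(2); a = Node(1, b)
y = Node(20); x = Node(10, y)
splice_after(a, x)       # afterwards: a   and   x -> b -> y
assert a.next is None
assert x.next is b and b.next is y
```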
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

Andrei
May 22 2010
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Keeping the length cached in a singly-linked list is a costly mistake.
 It works against splicing (an important list primitive) and most of the
 time you don't need it so why waste time updating it.

...

    LinkedList(T, bool WithLength = true)
    {
        static if (WithLength)
        {
            // defines the length attribute here, etc.
        }
    }

Inside the methods like add there is another static if (WithLength) that enables/disables the code that updates the length. This keeps the length by default, but makes it easy to remove it when desired.

Bye,
bearophile
May 22 2010
Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:
 
 One thing before I forget: I think any good collection abstraction must
 be concretized back to the classic collection instances. Singly-linked
 lists definitely can't be left off the list! It would be an epic 
 failure. Imagine the papers! New York Times: "D has containers, but no 
 singly-linked lists".

We could always say we were following C++'s lead :-)
May 22 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/22/2010 09:07 PM, Sean Kelly wrote:
 Andrei Alexandrescu Wrote:
 One thing before I forget: I think any good collection abstraction must
 be concretized back to the classic collection instances. Singly-linked
 lists definitely can't be left off the list! It would be an epic
 failure. Imagine the papers! New York Times: "D has containers, but no
 singly-linked lists".

We could always say we were following C++'s lead :-)

C++(0|1)x has forward_list. Needless to say, supporting size() at constant complexity was a matter of huge debate. The proposal that I like is this:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2543.htm

Search the page for "size()". I don't know how the proposal was amended.

Andrei
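The trade-off n2543 settles on is that forward_list simply has no size(): nodes stay minimal and splicing stays O(1), and anyone who wants the size walks the list. A sketch of that walk in Python (for illustration; the actual proposal is C++):

```python
# Sketch: a forward_list-style singly-linked list with no cached size.
# Asking for the length means an O(n) walk -- the price accepted in
# exchange for minimal nodes and constant-time splicing.
class Node:
    def __init__(self, v, nxt=None):
        self.v = v
        self.next = nxt

def length(head):
    n = 0
    while head is not None:
        n += 1
        head = head.next
    return n    # O(n)

lst = Node(1, Node(2, Node(3)))
assert length(lst) == 3
assert length(None) == 0    # empty list
```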
May 22 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collection roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.
 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list.  That was a decision I made early on, because it allows more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that. My main point was that if one defines an slist, in many cases one defines it to be very small, simple, and cheap. Maintaining size will come on the radar and would discourage the use of the abstraction in favor of a hand-rolled implementation. This is failure to abstract.
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest what a bimap is :)  I am not an expert in collections or data structures, I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can do.

That being said, if you have any implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

I am not familiar with tries, and dcollections has no class hierarchy.

I don't know of a word for "hierarchy with interfaces as root and inner nodes and classes as leaf nodes", so I just call that a class hierarchy.

Andrei
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this:

    auto total = cont1.length + cont2.length;

because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collection roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.

I agree, removing all the interfaces would get rid of the need for NO_LENGTH_SUPPORT. But all dcollections define a valid length, so that point is kinda moot :)

Not moot because you have interfaces.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat? Is add expected to put things in a specific place? How does slist implement back()? slist does not fit the List interface.
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that.

I felt like you did. Your statement was: "It works against splicing (an important list primitive) and most of the time you don't need it so why waste time updating it." While you didn't specifically say it was used more, you mentioned the importance of splicing, and the non-importance of length. I guess I should have said, I disagree that splicing is more important than caching length.
 My main point was that if one defines an slist, in many cases one
 defines it to be very small, simple, and cheap. Maintaining size will
 come on the radar and would discourage the use of the abstraction in
 favor of a hand-rolled implementation. This is failure to abstract.

All dcollections own their elements; it is not like an array, where there can be multiple references to subsets of the data. An slist would simply be an slist as you describe, with a slightly bigger head. In other words, the head node has the length cache and allocator, and object monitor, and whatever else you need for the whole list. The nodes themselves would be a simple element and pointer.

But the elements are never exposed except via ranges and cursors. The ranges and cursors cannot affect the topology of the list; you need the head "owner" node to do that.

I understand. The problem is when you many such lists or when you manipulate a few intensively.
 I look at it this way, dcollections are "good enough" for most uses, if
 you need highly tailored data structures with specific uses in mind, you
 can make those on your own.

 For example, dcollections' LinkList can easily take the place of a
 simple slist. It may not be as fast with your specific requirements in
 mind, but it works. The benefit is, it works like all the other
 collection types and it's easy to learn all of them together because
 they all have certain properties.

That's what STL said about slists. Next thing you know... forward_list is being standardized.

Andrei
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 11:14 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this:

    auto total = cont1.length + cont2.length;

because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check.

I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length; it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part
 of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

I think we need to build the shared vision that in Phobos we're competing against hand-written, highly-optimized code. This is the fight STL took and largely won. We can't afford to lower our standards for one tiny bit.

I was once talking over a beer about the advantages of container abstractions. Walter slapped my face by defining the fastest SLL in the world on a napkin. It looked like this:

    struct List { int v; List * next; }

He took so little time to define that, and the primitives he needed were so simple and fast, that it was an absolute overkill to replace that with a generic whiz-bang container. If I'd mentioned there'd be any extra data involved, that would mean the end of negotiation. "Why do you need that?" "For length()!" "But I don't need length()!" "Well you have to." "That's Communism in design!"

And I agree.

Andrei
May 24 2010
Pelle <pelle.mansson gmail.com> writes:
On 05/24/2010 06:27 PM, Andrei Alexandrescu wrote:
 struct List { int v; List * next; }

While I do agree with that design for a list, that is no reference type. I thought we wanted reference types.
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 02:01 PM, Pelle wrote:
 On 05/24/2010 06:27 PM, Andrei Alexandrescu wrote:
 struct List { int v; List * next; }

While I do agree with that design for a list, that is no reference type.

List* is. My point was that the pressure for a really simple hand-rolled SLL is huge. A good abstraction should convince the Walters of this world that the abstraction is actually better than the hand-rolled SLL.

Andrei
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it possibly  
 could be?  Hell no, but it's good enough for most situations, and safe.

Append in D dynamic arrays is awful: it's slow and requires complex code (and currently there are no explanations of how it works in the D docs, see http://d.puremagic.com/issues/show_bug.cgi?id=4091 ), so it's not a good example to be copied :-) It's not fast enough for most situations; that's why there is still the need for an Appender.

Bye,
bearophile
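The reason an Appender type helps is amortized growth: reserve capacity geometrically so that most appends touch no allocator at all. A toy sketch in Python (illustrative only; this is not bearophile's dlibs Appender or Phobos code, and the names are invented):

```python
class Appender:
    """Toy appender: doubles capacity when full, so each append is
    amortized O(1) even though an occasional append pays for a grow."""
    def __init__(self):
        self.data = [None] * 4   # initial capacity
        self.len = 0

    def append(self, x):
        if self.len == len(self.data):
            # double the capacity: O(n) now, but amortized O(1) overall
            self.data.extend([None] * len(self.data))
        self.data[self.len] = x
        self.len += 1

    def result(self):
        return self.data[:self.len]

a = Appender()
for i in range(10):
    a.append(i)
assert a.result() == list(range(10))
```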
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
And the appending is hard to use too: see the ridiculously complex-to-use &
messy capacity, reserve, and assumeSafeAppend. So overall it's a good
example of what I will never want to copy.

Bye,
bearophile
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 complex code?
 a ~= b;
 Seems pretty not complex.

I meant on the implementation side. Arrays are used all the time, and they are basic building blocks used everywhere, so a programmer probably must understand how they work inside.
 That part is still pretty new.  What do you find is messy and complex  
 about it?

The internal semantics of the current design of dynamic arrays are not transparent enough; this means I often don't know what it is doing. This can be acceptable for a more complex data structure such as the associative arrays. And it's not easy to know exactly where to use assumeSafeAppend. Despite their complexity, D dynamic arrays are not safe enough; see the small thread about passing D dynamic arrays by reference by default.
 When I said "good enough for most situations" I didn't mean "most  
 situations where appending speed is crucial to the success of the  
 application" :P

I have written a few hundred small D programs, and I have seen that appending happens often (more often than I had originally thought) in points where performance is important enough. Using the Appender of my dlibs was usually able to solve the problem. I don't know better solutions, so maybe I'm just wasting your time; I am sorry. Before seeing D I never thought that designing good dynamic arrays could be so hard :-)

Bye,
bearophile
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 02:29 PM, bearophile wrote:
 Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it possibly
 could be?  Hell no, but it's good enough for most situations, and safe.

Append in D dynamic arrays is awful: it's slow and requires complex code (and currently there are no explanations of how it works in the D docs, see http://d.puremagic.com/issues/show_bug.cgi?id=4091 ), so it's not a good example to be copied :-) It's not fast enough for most situations; that's why there is still the need for an Appender.

When was the last time you measured? I thought the speed has largely improved since Steve integrated his work. Andrei
May 24 2010
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 When was the last time you measured? I thought the speed has largely 
 improved since Steve integrated his work.

I have timed it after that integration. I have seen a performance improvement, but it's small. I can perform some synthetic benchmarks.

Later,
bearophile
May 24 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 01:49 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 A pointer to the end element would be required for both appending and
 back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :)

They are. Ask Walter for a good example.
 The overhead of storage is minimal
 compared to the elements of the list, which do not contain said overhead.

I'm talking containers of lists here.
 I think we need to build the shared vision that in Phobos we're
 competing against hand-written, highly-optimized code. This is the
 fight STL took and largely won. We can't afford to lower our standards
 for one tiny bit.

 I was once talking over a beer about the advantages of container
 abstractions. Walter slapped my face by defining the fastest SLL in
 the world on a napkin. It looked like this:

 struct List { int v; List * next; }

 He took so little time to define that, and the primitives he needed
 were so simple and fast, that it was an absolute overkill to replace
 that with a generic whiz-bang container. If I'd mentioned there'd be
 any extra data involved, that would mean the end of negotiation. "Why
 do you need that?" "For length()!" "But I don't need length()!" "Well
 you have to." "That's Communism in design!"

 And I agree.

But something like the above is prone to all sorts of nasty things, like inadvertent cycles in your list.

Yeah, and I'd have my list of arguments to add too.
 Which may be acceptable to you if you
 want the bleeding fastest speed available. There will always be tailored
 code crafted to be as fast as possible for some specific design and
 dcollections and your slist just won't fit the bill.

Except that the set is considerably smaller for a well-designed slist.
 And dcollections' link list node is exactly what you wrote there (with a
 prev pointer of course). The only difference is, I don't define an
 element is the same as a list. The whole list has it's own data,
 including a pointer to the first element in the list.

 Look at D's arrays. Is appending with D's arrays the fastest it possibly
 could be? Hell no, but it's good enough for most situations, and safe.

If I could make them faster without adversely affecting the performance of other operations, I would. There are considerable constraints at play in the array design, for historical and other reasons. Andrei
May 24 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies.

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement in time those and many more in Phobos without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in thinking that another very, very similar container will come along.

Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:39 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies.

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement in time those and many more in Phobos without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in thinking that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys.

The key point of tries is that they have distributed storage. Thus, shoehorning them into the interface function

Iterator!(K[]) keys();

forces allocation and copying. Andrei
May 24 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Is it unreasonable to expect to be able  
 to iterate the keys in a trie?  (I don't really know, I've never worked  
 with them)

On tries you can iterate on all keys or on a sorted range of the keys. Bye, bearophile
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie?  (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if the key is a string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument: it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x'), which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind). Andrei
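The exploratory range described above can be sketched on a toy trie. The node layout and the names (TrieNode, Explorer) are illustrative only, not dcollections or Phobos API:

```d
// A toy trie over strings, with an exploratory "range": each popFront(c)
// narrows the view to keys extending the current prefix by the element c.
struct TrieNode
{
    TrieNode*[char] children;
    bool terminal;      // a key ends at this node
    size_t count;       // number of keys stored in this subtree
}

void addKey(TrieNode* root, string key)
{
    auto p = root;
    p.count++;
    foreach (char c; key)
    {
        if (auto next = c in p.children)
            p = *next;
        else
        {
            auto n = new TrieNode;
            p.children[c] = n;
            p = n;
        }
        p.count++;
    }
    p.terminal = true;
}

struct Explorer
{
    TrieNode* node;     // null once the range has run dry

    bool empty() const { return node is null; }
    // front() reports on all keys sharing the prefix spanned so far.
    size_t front() const { return node.count; }
    // popFront takes an argument: extend the prefix by one element.
    void popFront(char c)
    {
        if (auto next = c in node.children)
            node = *next;
        else
            node = null;
    }
}

void main()
{
    auto root = new TrieNode;
    foreach (k; ["ax", "ay", "b"])
        addKey(root, k);

    auto e = Explorer(root);
    e.popFront('a');
    assert(!e.empty && e.front == 2);   // "ax" and "ay" share prefix "a"
    e.popFront('x');
    assert(!e.empty && e.front == 1);   // only "ax" remains
    e.popFront('q');
    assert(e.empty);                    // nothing with prefix "axq"
}
```

Note that this Explorer is not foreachable precisely because popFront needs an argument - which is the point being made below about it not fitting the existing interfaces.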
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth comes to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.
 Or is there another way to iterating the keys that would be more
 efficient than building an array to contain the 'key' for each node?

I don't know. I'm just glad I can explore that without having to think in interfaces. Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 03:18 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 15:58:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never
 worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth comes to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).

No there isn't. I wouldn't make that an interface, as it's not foreachable. I'd build a special range for it.
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.

Including keys?!!! Still not a straight answer. If you should be able to iterate keys, then say you should be able to iterate keys. If you shouldn't, then say that.

Sorry. Yes, by-key iteration should be possible. Andrei
May 24 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy tree traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone? Andrei
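A minimal sketch of reading a key back off a leaf through parent links; the node layout is assumed for illustration (not dcollections'), and the prepend loop is the copy into a K[] that an interface like keys() would force:

```d
// Reconstructing a trie key by walking parent pointers from a leaf.
// Each node stores the edge character that led to it from its parent.
struct Node
{
    Node* parent;
    char edge;      // meaningless for the root
}

// Builds the key bottom-up. Prepending like this is quadratic overall,
// which is fine for a sketch; the point is the unavoidable copy.
string keyOf(const(Node)* leaf)
{
    char[] buf;
    for (auto p = leaf; p.parent !is null; p = p.parent)
        buf = p.edge ~ buf;
    return buf.idup;
}

void main()
{
    auto root = new Node(null, '\0');
    auto a = new Node(root, 'a');
    auto x = new Node(a, 'x');
    assert(keyOf(x) == "ax");
    assert(keyOf(root) == "");
}
```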
May 24 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 04:38 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy tree traces into T[] arrays. If the trie stores parent nodes, you don't need to do that as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know.

This wins a little battle but not the war. Long term you're facing the Sisyphean job of either knocking new containers into the existing interfaces, or extending the interfaces to accommodate new containers. It doesn't look to me like a winning proposition. The FIC (Federation of Independent Containers) does not have that problem. It does have its specifications and guidelines, but whichever container doesn't support some or even all of the required methods can simply define its own under other names. Then users can play with the containers and with the ranges they define as they see fit. Andrei
May 24 2010
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
Sheer awesomeness!  I've been waiting for a long time for a decent collections lib for D2.  I haven't tested it extensively, but it definitely builds w/o a hitch (using a custom build script I wrote that builds dcollections and other libs I use in one go) and works for the toy examples I tested.

One small comment, though:  Looking at your ArrayList code, wouldn't it make more sense to use a capacity field here than to use the builtin array appending mechanism, given that this is a class, not a struct?
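Roughly, the suggestion is that a class can own its storage and track capacity itself, instead of paying for the runtime's per-append block-capacity lookup. A hypothetical sketch (names and layout are illustrative, not the actual ArrayList code):

```d
// A class-based array list that tracks its own capacity, per dsimcha's
// suggestion. buf.length is the capacity; len is the element count.
class ArrayListSketch(T)
{
    private T[] buf;
    private size_t len;

    void add(T v)
    {
        if (len == buf.length)              // grow geometrically
        {
            auto bigger = new T[buf.length ? buf.length * 2 : 4];
            bigger[0 .. len] = buf[0 .. len];
            buf = bigger;
        }
        buf[len++] = v;                     // no runtime capacity query
    }

    size_t length() { return len; }
    ref T opIndex(size_t i) { return buf[i]; }
}

void main()
{
    auto a = new ArrayListSketch!int;
    foreach (i; 0 .. 100)
        a.add(i);
    assert(a.length == 100 && a[99] == 99);
}
```

Since the class uniquely owns buf, in-place extension is safe in a way it isn't for a bare D slice, which can't know whether something else references the tail of its block.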
May 19 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
dsimcha Wrote:

 One small comment, though:  Looking at your ArrayList code, wouldn't it
 make more sense to use a capacity field here than to use the builtin
 array appending mechanism, given that this is a class, not a struct?

Yes, that is part of my semi-documented long term plan. I actually am planning on adding an "extendLength" runtime function to support extending the length of an array as much as possible without reallocating. That way I can make ArrayList independent of the allocation sizes that the GC supports. -Steve
May 19 2010
prev sibling next sibling parent Graham Fawcett <fawcett uwindsor.ca> writes:
Hi Steven,

On Wed, 19 May 2010 12:09:11 -0400, Steven Schveighoffer wrote:

 After much work and toil, I have created dcollections for D2.  I think I
 can say that this is the first collection package available for D2.  I
 still hold hope that it will be used as the standard collection package
 for phobos for D2.

Cool! In case anyone's interested, I tried out dcollections using my very alpha 'd-build' script, and it works just fine:

http://github.com/gmfawcett/d-build/tree/dcollections-test

tarball of the dcollections test branch:

http://github.com/gmfawcett/d-build/tarball/dcollections-test

Checking out the 'dcollections-test' branch of that git project, you should be able to just run './d-build test.d' and it will first download dcollections and then compile it along with the test program. There are several limitations; see the README for details.

There's a long way to go before d-build is more than a toy. But I'd like to keep it on people's radars, and am interested in your thoughts and feedback. See the "envy for go packages" threads on this list for context.

Regards,
Graham
 
 Please visit http://www.dsource.org/projects/dcollections for details.
 
 Some VERY IMPORTANT notes for this release:
 
 1. it is very beta software.  Although the implementation has changed
 very little from the 1.0 version, not everything will properly compile. 
 I have filed several compiler bugs to fix some problems I had while
 porting, and inserted workarounds for things I could work around. 
 Please use the LATEST compiler (currently 2.046) because some very
 important features have been added that dcollections relies on. 
 Although it is beta, the algorithms and implementation is largely
 unchanged so functionality should be pretty solid.
 
 2. the docs are not online for d2.  Unfortunately, dsource's
 auto-doc-generator that I was using for the 1.0 version is based on a
 1.0 compiler, so it will not automatically generate the d2 docs. 
 However, I have included some basic ddoc generated docs in the download
 package, don't expect them to be very fun to use :)
 
 3. The docs are not fully updated, so they might be just flat out wrong!
 I plan to update the docs completely for the next beta release.
 
 So, for the good stuff... here are some notes for differences between
 1.0 and 2.0:
 
      * Ranges!
      * Filled out slicing for all containers
      * Cursors are now non-movable.  Use ranges for safe iteration.
      * The interfaces have been cut down significantly. The question I
 asked myself when deciding whether I wanted to keep an interface is
 "would anyone use this interface?"
      * Functions that should be fast but can be slow (O(n)) have been
 removed from all interfaces, and in most cases, from the containers.
 Notably missing is searching a non-quick lookup container for a value.
 Use std.algorithm.find for that.
      * ArrayMultiset has been removed -- its complexity functions did not
 fit with the multiset requirements (specifically, quick lookup of an
 element's presence).
      * ArrayList slicing now simply returns a range instead of a "live"
 slice.  Note that an ArrayList range is aliased to a D array.
      * Full unit tests added for all container types.
 
 I have also changed the license from BSD to Boost.
 
 
 --------
 
 I have also formally released version 0.03 as version 1.0, nothing has
 changed there, including the license.  However, no new features will be
 added to dcollections version 1.0, only bug fixes will be done.
 
 Please, file tickets on dsource for any problems you have.  And let me
 know if there are any ideas you think would be useful.
 
 -Steve

May 19 2010
prev sibling next sibling parent Graham Fawcett <fawcett uwindsor.ca> writes:
On Wed, 19 May 2010 20:27:17 +0000, Graham Fawcett wrote:

 Hi Steven,
 
 On Wed, 19 May 2010 12:09:11 -0400, Steven Schveighoffer wrote:
 
 After much work and toil, I have created dcollections for D2.  I think
 I can say that this is the first collection package available for D2. 
 I still hold hope that it will be used as the standard collection
 package for phobos for D2.

Cool! In case anyone's interested, I tried out dcollections using my very alpha 'd-build' script, and it works just fine:

http://github.com/gmfawcett/d-build/tree/dcollections-test

tarball of the dcollections test branch:

http://github.com/gmfawcett/d-build/tarball/dcollections-test

Checking out the 'dcollections-test' branch of that git project, you should be able to just run './d-build test.d' and it will first download dcollections and then compile it along with the test program. There are several limitations; see the README for details.

There's a long way to go before d-build is more than a toy. But I'd like to keep it on people's radars, and am interested in your thoughts and feedback. See the "envy for go packages" threads on this list for context.

Apologies; the "envy for go packages" thread is on the D list, not on D.announce. Graham
May 19 2010
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Steven Schveighoffer Wrote:

 After much work and toil, I have created dcollections for D2.  I think I  
 can say that this is the first collection package available for D2.  I  
 still hold hope that it will be used as the standard collection package  
 for phobos for D2.
 
 Please visit http://www.dsource.org/projects/dcollections for details.
 
 Some VERY IMPORTANT notes for this release:
 
 1. it is very beta software.  Although the implementation has changed very  
 little from the 1.0 version, not everything will properly compile.  I have  
 filed several compiler bugs to fix some problems I had while porting, and  
 inserted workarounds for things I could work around.  Please use the  
 LATEST compiler (currently 2.046) because some very important features  
 have been added that dcollections relies on.  Although it is beta, the  
 algorithms and implementation is largely unchanged so functionality should  
 be pretty solid.
 
 2. the docs are not online for d2.  Unfortunately, dsource's  
 auto-doc-generator that I was using for the 1.0 version is based on a 1.0  
 compiler, so it will not automatically generate the d2 docs.  However, I  
 have included some basic ddoc generated docs in the download package,  
 don't expect them to be very fun to use :)
 
 3. The docs are not fully updated, so they might be just flat out wrong!   
 I plan to update the docs completely for the next beta release.
 
 So, for the good stuff... here are some notes for differences between 1.0  
 and 2.0:
 
      * Ranges!
      * Filled out slicing for all containers
      * Cursors are now non-movable.  Use ranges for safe iteration.
      * The interfaces have been cut down significantly. The question I  
 asked myself when deciding whether I wanted to keep an interface is "would  
 anyone use this interface?"
      * Functions that should be fast but can be slow (O(n)) have been  
 removed from all interfaces, and in most cases, from the containers.  
 Notably missing is searching a non-quick lookup container for a value. Use  
 std.algorithm.find for that.
     * ArrayMultiset has been removed -- its complexity functions did not  
 fit with the multiset requirements (specifically, quick lookup of an  
 element's presence).
      * ArrayList slicing now simply returns a range instead of a "live"  
 slice.  Note that an ArrayList range is aliased to a D array.
      * Full unit tests added for all container types.
 
 I have also changed the license from BSD to Boost.
 
 
 --------
 
 I have also formally released version 0.03 as version 1.0, nothing has  
 changed there, including the license.  However, no new features will be  
 added to dcollections version 1.0, only bug fixes will be done.
 
 Please, file tickets on dsource for any problems you have.  And let me  
 know if there are any ideas you think would be useful.
 
 -Steve

Well as long as you're here can I submit an error here? :)

I get an error while building the D2 branch:

Error: ArrayMultiset.obj : No such file or directory

It seems the ArrayMultiset.d is not in the dcollections folder for the D2.x branch, although it is in trunk (I guess the trunk is the D1.x one?). Is that module deprecated for D2.x? (although it's listed in the win32 batch file)
May 19 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrej Mitrovic Wrote:


 Well as long as you're here can I submit an error here? :)
 
 I get an error while building the D2 branch:
 
 Error: ArrayMultiset.obj : No such file or directory

Crud, I admit that I assumed anything that built on Linux would build on Windows. I still believe it will, but of course, I need to change the batch file :)
 
 It seems the ArrayMultiset.d is not in the dcollections folder for the D2.x 
branch, although it is in trunk (I guess the trunk is the D1.x one?). 

Yes, D1 is trunk, D2 is going to eventually be trunk
 
 Is that module deprecated for d2.x? (although it's listed in the win32 batch
file)

Yes, just remove it. I'll fix it when I get a chance. -Steve
May 19 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 11:09 AM, Steven Schveighoffer wrote:
 After much work and toil, I have created dcollections for D2.

Awesome! I'm looking through it, unfortunately after wandering in the trunk for a while (I was like, wait, what?). But that was after all good because I saw a lot of awesome improvements in the new library. This is solid work. I salute your decision to frame complexity as a feature and remove functions with "undecided complexity". This is huge.

That being said, I think dcollections2 has not yet solved a number of problems, some minor, some annoying, and some fundamental. This makes things quite unpleasant because, willingly or not, I'm in a three-way conflict of interest: (1) I can influence to some extent the decision of adopting dcollections2 for phobos; (2) I have my own design in mind which is competing with dcollections2; (3) my time is scarce so I'm having difficulty executing on that design, whereas dcollections2 is already usable.

I'm not sure where this leaves us. I'm tempted to post a list of "grievances" with the current design, but it's difficult to make that sound sincere due to the conflict of interest. Let me start by asking this: on a scale of 0 ("no change") to 10 ("redesign the whole thing"), where do you stand regarding the perspective of operating changes to dcollections2?

Andrei
May 19 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:

 On 05/19/2010 11:09 AM, Steven Schveighoffer wrote:
 After much work and toil, I have created dcollections for D2.

Awesome! I'm looking through it, unfortunately after wandering in the trunk for a while (I was like, wait, what?). But that was after all good because I saw a lot of awesome improvements in the new library. This is solid work.

Yeah, I will move the D1 version to a branch and copy the D2 version to trunk when I get a chance. Part of the reason is that the automatic doc generator will puke if the trunk is D2.
 That being said, I think dcollections2 has not yet solved a number of 
 problems, some minor, some annoying, and some fundamental. This makes 
 things quite unpleasant because, willingly or not, I'm in a three-way 
 conflict of interest: (1) I can influence to some extent the decision of 
 adopting dcollections2 for phobos; (2) I have my own design in mind 
 which is competing with dcollections2; (3) my time is scarce so I'm 
 having difficulty executing on that design, whereas dcollections2 is 
 already usable.
 
 I'm not sure where this leaves us. I'm tempted to post a list of 
 "grievances" with the current design, but it's difficult to make that 
 sound sincere due to the conflict of interest. Let me start by asking 
 this: on a scale of 0 ("no change") to 10 ("redesign the whole thing"), 
 where do you stand regarding the perspective of operating changes to 
 dcollections2?

It depends on what you want to change. If you say cursors have to go, then I would say no. If you think some functions/classes need to be renamed, or the module structure needs to be changed, I'd say fine. There are other design decisions that can be changed, but I look at it from the perspective of usefulness to me :) If it stops being useful to me, or does not provide a feature I want, then I'll not use it, and then I won't want to maintain it. I guess start with the big ones, 'cause minor ones aren't likely to be a problem. -Steve
May 19 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 04:37 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:
 I'm not sure where this leaves us. I'm tempted to post a list of
 "grievances" with the current design, but it's difficult to make
 Let me start by
 asking this: on a scale of 0 ("no change") to 10 ("redesign the
 whole thing"), where do you stand regarding the perspective of
 operating changes to dcollections2?

It depends on what you want to change. If you say cursors have to go, then I would say no. If you think some functions/classes need to be renamed, or the module structure needs to be changed, I'd say fine. There are other design decisions that can be changed, but I look at it from the perspective of usefulness to me :) If it stops being useful to me, or does not provide a feature I want, then I'll not use it, and then I won't want to maintain it. I guess start with the big ones, 'cause minor ones aren't likely to be a problem.

Well, the thing is, I'm strongly questioning the whole point of defining a hierarchy for containers, in particular when the interfaces are numerous. I'd go as heretical as saying "Container hierarchies are for languages that suck." Also, the fact that (for efficiency reasons) ranges are concrete and inaccessible from the abstract interfaces aggravates the matter even more, to the point where containers are neither horses nor mules: if you use the interfaces you have only severely emasculated access to the container's elements, and if you use the concrete classes, why the heck bother with hierarchies in the first place?

Whew. Let me explain; there's a lot to explain.

Right now there are 10 interfaces in 7 files in the /model/ subdir. It happens quite often that a given class inherits more than one interface; for example ArrayList inherits two, and many set types inherit Set, which in turn inherits Addable!V, Iterator!V, Purgeable!V. The problem with this setup that encodes capabilities in interfaces is that many sensible combinations of capabilities are impossible to obtain. Say you want to define a function messWith(C) where C is Purgeable and Addable. There's no type for that! Set has too many capabilities (which not all containers might support), so you'll need to do a runtime check:

void messWith(Addable!int ia)
{
    auto ip = cast(Purgeable!int) ia;
    enforce(ip, "Wrong type etc.");
    ...
}

It would be more expressive if such capability constraints could be expressed during compilation. To see more about this problem, you may want to check "Compound Types for Java" by Buechi and Weck (just google it). The paper explains the problem in detail and proposes a Java language change to fix it. Java didn't change and was not able to devise a library solution.

I wrote a solution to the problem in native D. It goes like this:

alias Container!(int, addable | purgeable) Messerschmidt;

void messWith(Messerschmidt i)
{
    ... use i's capabilities to add and purge ...
}

The inspiration for this approach was given by Scott Meyers' article "Red code, green code". The approach does work, and rather beautifully, but it's impractical: the class hierarchy forms a dense DAG, and if you add 4-5 capabilities an empty object already has a size of 8KB consisting only of routing vtables.

Escalating this discussion one step further (and probably a bit outside the area of commonly-accepted lore), let me hypothesize that maybe the entire idea of container hierarchies is a red herring, the hoax of the 1990s. Hierarchies are for types that have a lot of commonality and a little variability. Containers are not that. They have personality, refuse to accept straitjacket interfaces, and each has unique capabilities - as your capability-based design witnesses. I agree that reuse via binary interfaces is good when you write functions that exploit them, but in D it's simple and cheap enough to write a concept-checked generic function to get around that. In my designs I either know what container I'm dealing with, or I am in generic-land. I never was like, "mmm, great, I can change this container type at runtime".

But wait, there's less. Furthermore, container hierarchies encourage design by inheritance, which is more often than not poor. If you want to define a container that notifies an observer whenever you add stuff to it, composition is the best route to follow. You don't want a horse with a violin grafted on its back; keep the horse a horse, play the violin, and it'll dance. I've never, ever been in a place in life when I thought, "I should derive from this container class and hook a method of it." Composition always wins.

To get back to one of my earlier points, the fact that the container interfaces are unable to express iteration is a corollary of the design's problem and a climactic disharmony.

My vision, in very brief, is to foster a federation of independent containers abiding by identical names for similar functionality. Then a few concept checks (a la std.range checks) can easily express what capabilities a given client function needs from a container.

Destroy me :o).

Andrei
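The std.range-style alternative can be made concrete with a small sketch. The names here (isAddable, isPurgeable, the Bag type, and remove as the purge primitive) are illustrative assumptions, not dcollections or Phobos API:

```d
// Compile-time capability checks in the style of std.range's isInputRange:
// a type qualifies by having the right methods, not by inheriting anything.
template isAddable(C, V)
{
    enum bool isAddable = is(typeof({ C c; V v; c.add(v); }));
}

template isPurgeable(C, V)
{
    enum bool isPurgeable = is(typeof({ C c; V v; c.remove(v); }));
}

// The compile-time analogue of messWith: a template constraint that
// demands both capabilities at once, with no Addable+Purgeable type needed.
void messWith(C)(ref C c)
    if (isAddable!(C, int) && isPurgeable!(C, int))
{
    c.add(42);
    c.remove(42);
}

// Any type with the right methods qualifies -- even a struct.
struct Bag
{
    int[] data;
    void add(int v) { data ~= v; }
    void remove(int v)
    {
        foreach (i, e; data)
            if (e == v) { data = data[0 .. i] ~ data[i + 1 .. $]; return; }
    }
}

void main()
{
    Bag b;
    messWith(b);
    assert(b.data.length == 0);
    static assert(isAddable!(Bag, int) && !isAddable!(int, int));
}
```

Passing a type without both capabilities fails at compile time in the constraint, rather than at run time in an enforce().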
May 19 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas. People can use dcollections in the following weeks and months, and when you have implemented your ideas the people that like them can switch to using your collections. Bye, bearophile
May 19 2010
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
bearophile Wrote:

 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas.

I don't think the ideas are mutually exclusive. I don't see why having an interface prevents someone from using a concept on the concrete class.
 People can use the dcollections in the following weeks and months, and when
you have implemented your ideas the people that like them can switch to using
your collections.

If it's at all possible, I'd like to cooperate on making dcollections phobos-qualified. So I'm glad to try and find a way to satisfy Andrei's concerns. -Steve
May 19 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 If it's at all possible, I'd like to cooperate on making dcollections
phobos-qualified.  So I'm glad to try and find a way to satisfy Andrei's
concerns.

Very good. Sometimes I am a too much pessimistic person :-) Bye, bearophile
May 19 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 08:53 PM, Steven Schveighoffer wrote:
 bearophile Wrote:

 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas.

I don't think the ideas are mutually exclusive. I don't see why having an interface prevents someone from using a concept on the concrete class.
 People can use the dcollections in the following weeks and months,
 and when you have implemented your ideas the people that like them
 can switch to using your collections.

If it's at all possible, I'd like to cooperate on making dcollections phobos-qualified. So I'm glad to try and find a way to satisfy Andrei's concerns.

I'm very grateful to hear that; as I said, dcollections embodies a great deal of solid work, for which I have all admiration. If we disagree, I hope we won't make this a willpower struggle :o). Let the best ideas win. That being said, how something is said matters even if what is said is true; I'll be extra careful with that. Andrei
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:21 PM, bearophile wrote:
 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are surely interesting, but I don't think there's a simple way to change his code according to your ideas. People can use dcollections in the following weeks and months, and when you have implemented your ideas the people that like them can switch to using your collections. Bye, bearophile

I am sorry, but I've been quite sick since yesterday afternoon. I still can't think straight because of fever, but once it subsides, I'll reply to some of the points made in this thread. Andrei
May 20 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 I am sorry, but I've been quite sick since yesterday afternoon.

Get better, and in the meantime don't worry about D. Bye, bearophile
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:21 PM, bearophile wrote:
 Andrei Alexandrescu:
 Destroy me :o).

Your ideas are certainly interesting, but I don't think there's a simple way to change his code according to them. People can use dcollections in the coming weeks and months, and once you have implemented your ideas, the people who like them can switch to your collections. Bye, bearophile

I actually believe there is a simple transition path. In essence the interfaces in model/ should be rewritten as isXxx tests and the containers should have all of their methods final. Andrei
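A rough sketch of the isXxx approach described here, modeled on std.range's isInputRange; all names below (isClearable, IntBag, reset) are hypothetical illustrations, not dcollections or Phobos API:

```d
// Sketch of an "isXxx" concept check replacing an interface: a
// compile-time test of the members a container must provide.
enum bool isClearable(C) = is(typeof({
    C c = C.init;
    auto n = c.length;  // must expose a length
    c.clear();          // must be clearable
}));

// A client constrains on the concept instead of accepting an interface:
void reset(C)(ref C c) if (isClearable!C)
{
    c.clear();
}

// Any type with the right members qualifies, struct or class:
struct IntBag
{
    int[] data;
    size_t length() { return data.length; }
    void clear() { data = null; }
}

static assert(isClearable!IntBag);
static assert(!isClearable!int);

void main()
{
    auto bag = IntBag([1, 2, 3]);
    reset(bag);
    assert(bag.length == 0);
}
```

Because the test is structural, the containers' methods can all be final: no virtual dispatch is needed for a type to satisfy the concept.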
May 21 2010
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container 
 interfaces are unable to express iteration is a corollary of the 
 design's problem and an climactic disharmony.
 
 My vision, in very brief, is to foster a federation of independent 
 containers abiding to identical names for similar functionality. Then a 
 few concept checks (a la std.range checks) can easily express what 
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy; only the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes, not structs. As long as you agree that classes are the right call, interfaces can co-exist with your other suggestions without interference.

Yes, if you want to say "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyway), or you could just say "I need a List" and get those functions too. It also does not force entities other than dcollections objects to be classes; they could be structs that implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them: binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is: go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them. Does that make sense? -Steve
May 19 2010
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer 
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea.

First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See "Tangled up in tools": http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

Second, designing a library to be flexible enough to meet some future, anticipated need (e.g. DLLs) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong.

Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to; I could have used generic code, but this way only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

Using interfaces is not as viral as you think. My interfaces can be used in generic code, as long as the generic code uses functions in the interfaces. If a library returns an interface, the author is saying "I don't want you using any functions outside this interface," so why is that a bad thing? Forcing people to *not* use interfaces has its drawbacks too. Dcollections gives the most flexible design I could muster while still being useful.

I'm not saying I'm against removing the interfaces at some later date, but I don't see any convincing arguments yet, especially since I've already seen benefits from having them. -Steve
May 20 2010
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:

 I understand these points, but I'm already using interfaces to copy 
 data between containers.  I don't have to, I could have used generic 
 code, but this way, only one function is instantiated to copy data from 
 all the other containers.  The problem with using generic code is that 
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot.

I do like interfaces in principle, but I fear it'll make things much slower when people implement things in terms of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:

 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See:
http://llvm.org/bugs/show_bug.cgi?id=6054
http://llvm.org/bugs/show_bug.cgi?id=3100

---------------------
Steven Schveighoffer:
The problem with using generic code is that the compiler will needlessly
duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, which merges identical functions (like the ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it work. See:
http://llvm.org/bugs/show_bug.cgi?id=6712
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108275

Bye, bearophile
May 20 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 08:30:59 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 
 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See: http://llvm.org/bugs/show_bug.cgi?id=6054 http://llvm.org/bugs/show_bug.cgi?id=3100

Devirtualization is only possible in certain cases: when the function knows exactly which type it'll get. But downcasting to a more generic type and passing it around function calls strips it of the precise information required for devirtualization. The only way to propagate the exact type is to either instantiate a new version of the function you call for that specific type (which is what a template does) or inline it (because inlining also creates a new instance of the function, inside the caller). For instance:

    void test1(List list) {
        list.clear(); // can't devirtualize since we do not know which kind of list we'll get
    }

    void test2() {
        List list = new ArrayList;
        list.clear(); // now the compiler can see we'll always have an ArrayList, can devirtualize
    }

    void test3(L)(L list) {
        list.clear(); // parent's type is propagated, can devirtualize if parent can
    }
 Steven Schveighoffer:
 
 The problem with using generic code is that the compiler will 
 needlessly duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, to merge identical functions (like ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now, so I'd expect most linkers to do it already...

The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:

Devirtualization is only possible in certain cases: when the function knows
exactly which type it'll get.<

You are wrong; in most cases there are ways to devirtualize even when the runtime type isn't exactly known, but sometimes doing it takes too much work. This is probably why the C# .NET runtime doesn't perform this optimization. It's a complex topic, I suggest you read about it; I can't explain it here. See polymorphic call points, megamorphic ones, dispatch trees, and so on. You can read something here too:
http://www.digitalmars.com/webnews/newsgroups/newsgroups.php?art_group=digitalmars.D&article_id=105630

LLVM is not able to do this well yet, but once the -O7 optimization level is available it will have the means (if not the capability) to devirtualize as well as HotSpot:
http://linux.die.net/man/1/llvmc

And even if you don't compile your code with -O7, there are ways to improve the situation with static code analysis (though usually this is not as good as type information collected at runtime or during profiling for profile-based optimization).
Indeed. I'm no expert in linkers, but in my opinion this is one of the most
basic optimizations a linker should perform.<

At its basic level it's an easy optimization, but as usual, if you want to implement something well, things get harder :-) In LDC it's the compiler that performs this optimization, and only if you manually add the -mergefunc switch, which is not enabled even at -O3.
The problem with templates are more the multiple slightly different
instanciations.<

This is what I was talking about with things getting "harder". A good implementation of this feature can recognize slightly different functions too, and remove such partial redundancy. This is doable, but LLVM is not able to do it yet. The good thing is that LLVM is still improving quickly.
In general this is good for performance, but it's only needed for the code
paths that need to be fast. I think generic containers should be fast.<

I am not sure what you mean here, but you may be interested in this thread I started: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=108136 Bye, bearophile
May 20 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 15:47:22 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 
 Devirtualization is only possible in certain cases: when the function 
 knows exactly which type it'll get.<

You are wrong, in most cases there are ways to de-virtualize, even when the runtime type isn't exactly known, but sometimes to do it you have to work too much. This is probably why C# dotnet doesn't perform this optimization. It's a complex topic, I suggest you read about it, I can't explain here; see polymorphic call points, megamorphic ones, dispatch trees, and so on.

I know I simplified a bit, and I'll admit you may know more than me about dynamic dispatch optimizations. But if I'm not mistaken, the other devirtualizing solutions create multiple instances of the code path based on either static info or a runtime switch. All this isn't very different from calling a templated function, but it's more complex and less reliable (because those optimizations might not be in the compiler, or might not be applicable to all scenarios).

My point all along was that it's better to stick with templates, where you're "guaranteed" to get the "optimal" code path. The downside is that you lose the dynamic dispatch capabilities. But I believe those are rarely needed in most programs. And if you really need it, it's quite easy and efficient to layer dynamic dispatch over static calls (much more so than the reverse). -- Michel Fortin michel.fortin michelf.com http://michelf.com/
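Layering dynamic dispatch over static calls, as suggested here, can be sketched like this (all names are hypothetical illustrations, not dcollections API):

```d
// An interface is recovered on demand by wrapping any type that
// passes the static checks in a small class.
interface Countable
{
    size_t count();
}

// One generic wrapper class turns any type with a length into a
// Countable, so nothing forces the containers themselves to be classes.
class CountableWrapper(C) : Countable
{
    private C c;
    this(C c) { this.c = c; }
    size_t count() { return c.length; }
}

void main()
{
    int[] data = [1, 2, 3];
    Countable dyn = new CountableWrapper!(int[])(data);  // dynamic view
    assert(dyn.count() == 3);   // virtual call through the interface
    assert(data.length == 3);   // static call, inlinable
}
```

The reverse direction (recovering static dispatch from an interface) is what's hard, which is the asymmetry the post points at.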
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 02:47 PM, bearophile wrote:
 Michel Fortin:

 Devirtualization is only possible in certain cases: when the
 function knows exactly which type it'll get.<

You are wrong, in most cases there are ways to de-virtualize, even when the runtime type isn't exactly known, but sometimes to do it you have to work too much. This is probably why C# dotnet doesn't perform this optimization.

If we get deeper into this branch, we'll forget where we started from (are interfaces sensible for this design?) and we'll reframe the interfaces vs. no interfaces decision into a speed loss vs. no speed loss decision. There are other arguments to look at besides performance. Andrei
May 21 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 09:41 AM, Michel Fortin wrote:
 On 2010-05-20 08:30:59 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:

 Surely going through all those virtual calls slows things down a lot.<

Right. But the purpose of a good compiler is to kill those, devirtualizing. LLVM devs are working on this too. See: http://llvm.org/bugs/show_bug.cgi?id=6054 http://llvm.org/bugs/show_bug.cgi?id=3100

Devirtualization is only possible in certain cases: when the function knows exactly which type it'll get. But downcasting to a more generic type and passing it around function calls strips it of this precise information required for devirtualization.

(I assume you meant upcasting.) Even then, it's possible to accelerate calls by doing method casing, type casing, class hierarchy analysis...
 The only way to propagate the
 exact type is to either instanciate a new version of the function you
 call for that specific type (which is what a template does) or inline it
 (because it also creates a new instance of the function, inline inside
 the caller).

 For instance:

 void test1(List list) {
 list.clear(); // can't devirtualize since we do not know which kind of
 list we'll get
 }

 void test2() {
 List list = new ArrayList;
 list.clear(); // now the compiler can see we'll always have an
 ArrayList, can devirtualize
 }

 void test3(L)(L list) {
 list.clear(); // parent's type is propagated, can devirtualize if parent
 can.
 }


 Steven Schveighoffer:

 The problem with using generic code is that the compiler will
 needlessly duplicate functions that are identical.<

See the -mergefunc compiler switch of LDC, to merge identical functions (like ones produced by template instantiations). This feature is not very powerful yet, but it's present and I have seen it works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now so I'd expect most linkers to do that already... The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast.

There's been talk that Walter's slow porting of the linker to C and then D will put him in the position to introduce such an optimization. Andrei
May 21 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 See the -mergefunc compiler switch of LDC, to merge identical
 functions (like ones produced by template instantiations). This
 feature is not very powerful yet, but it's present and I have seen it
 works.

Indeed. I'm no expert in linkers, but in my opinion this is one of the most basic optimizations a linker should perform. And C++ has pushed linkers to do that for years now so I'd expect most linkers to do that already... The problem with templates is more the multiple slightly different instantiations. In general this is good for performance, but it's only needed for the code paths that need to be fast. I think generic containers should be fast.

There's been talk that Walter's slow porting of the linker to C and then D will put him in the position to introduce such an optimization.

Unfortunately, it's an agonizingly slow process. It also does nothing for Linux or OSX.
May 21 2010
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Michel Fortin Wrote:

 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy 
 data between containers.  I don't have to, I could have used generic 
 code, but this way, only one function is instantiated to copy data from 
 all the other containers.  The problem with using generic code is that 
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. The speedup you get from inlining is much larger than the difference between virtual and non-virtual calls. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

One thing I just thought of -- in dcollections, similar types can be compared to one another. For example, you can check to see if a HashSet is equal to a TreeSet. But that would not be possible without interfaces. -Steve
May 20 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be compared
to one another.  For example, you can check to see if a HashSet is equal to a
TreeSet.  But that would not be possible without interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever. You could compare the content separately, but that wouldn't require interfaces.
May 20 2010
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Pelle Wrote:

 On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be compared
to one another.  For example, you can check to see if a HashSet is equal to a
TreeSet.  But that would not be possible without interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever.

They aren't trees and hashtables; they are sets. The "quick lookup" feature is implemented via a hashtable or tree. Basically, a set is defined by the exact elements in it, regardless of order. Comparing two sets for equality should always be possible. -Steve
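The set semantics described here can be sketched as follows (toy code with plain arrays standing in for the containers; dcollections' real HashSet/TreeSet are classes sharing a Set interface):

```d
import std.algorithm : canFind;

// Two sets are equal iff they hold exactly the same elements,
// regardless of the iteration order their lookup structure produces.
bool setEqual(A, B)(A a, B b)
{
    if (a.length != b.length)
        return false;
    foreach (e; a)
        if (!b.canFind(e))  // a real set would use an O(1)/O(lg n) contains()
            return false;
    return true;
}

void main()
{
    int[] hashOrder = [3, 1, 2];  // e.g. hash-table iteration order
    int[] treeOrder = [1, 2, 3];  // sorted tree iteration order
    assert(setEqual(hashOrder, treeOrder));
    assert(!setEqual(hashOrder, [1, 2, 4]));
}
```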
May 20 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 09:14 AM, Pelle wrote:
 On 05/20/2010 03:22 PM, Steven Schveighoffer wrote:
 One thing I just thought of -- in dcollections, similar types can be
 compared to one another. For example, you can check to see if a
 HashSet is equal to a TreeSet. But that would not be possible without
 interfaces.

 -Steve

I'm sorry, but I think that's a misfeature. In my opinion, a tree is not equal to a hash table, ever.

Yes. By the way, TDPL's dogma imposes that a == b for classes means they have the same dynamic type. Andrei
May 21 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-20 09:22:27 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:

 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data from
 all the other containers.  The problem with using generic code is that
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual.

Yes, but that was part of the equation: a call to a template function can be inlined, not a virtual call (most of the time).
 However, I should probably make all the functions in the concrete 
 implementations final.  I made several of them final, but I should do 
 it across the board.

Yeah, probably.
 One thing I just thought of -- in dcollections, similar types can be 
 compared to one another.  For example, you can check to see if a 
 HashSet is equal to a TreeSet.  But that would not be possible without 
 interfaces.

I'm not sure what you're saying here. Are you saying it can be done with an interface but not with a template type? Why can't this work:

    class TreeSet {
        bool opEquals(T)(T t) if (IsSet!T) {
            if (t.length != this.length)
                return false;
            foreach (element; this) {
                if (!t.has(element))
                    return false;
            }
            return true;
        }
    }

    TreeSet a = new TreeSet;
    HashSet b = new HashSet;
    assert(a == b);

(sorry, I still have to start using the new operator overloading syntax.) -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 20 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Michel Fortin Wrote:

 On 2010-05-20 09:22:27 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com> said:
 
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data from
 all the other containers.  The problem with using generic code is that
 the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual.

Yes, but that was part of the equation: a call to a template function can be inlined, not a virtual call (most of the time).

So if you want inlining, use the actual type; nothing is stopping you. Well, except the non-final functions -- I have to fix that.
 One thing I just thought of -- in dcollections, similar types can be 
 compared to one another.  For example, you can check to see if a 
 HashSet is equal to a TreeSet.  But that would not be possible without 
 interfaces.

I'm not sure of what you're saying here. Are you saying it can be done with an interface but not with a template type? Why can't this work:

Because comparing two objects for equality now calls this function:

    bool opEquals(Object obj1, Object obj2)

which is defined in object_.d in the runtime. No compile-time anything is allowed. -Steve
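A minimal sketch of the constraint described here: for classes, == dispatches through the Object-level opEquals, so only a method matching Object's signature participates in the comparison (Box is a hypothetical example type):

```d
// For classes, a == b is rewritten to a call through Object.opEquals,
// resolved at runtime at the Object level; a templated opEquals never
// participates in ==.
class Box
{
    int v;
    this(int v) { this.v = v; }

    override bool opEquals(Object o)  // must match Object's signature
    {
        auto b = cast(Box) o;
        return b !is null && b.v == v;
    }
}

void main()
{
    Object a = new Box(1);
    Object b = new Box(1);
    assert(a == b);  // dispatches through the Object-level opEquals
}
```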
May 20 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

You get a much higher speedup when things can be inlined than virtual vs.
non-virtual.<

In current D compilers virtual functions prevent inlining. This can give some performance loss. Bye, bearophile
May 20 2010
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
bearophile Wrote:

 Steven Schveighoffer:
 
You get a much higher speedup when things can be inlined than virtual vs.
non-virtual.<

In current D compilers virtual functions prevent inlining. This can give some performance loss.

People seem to think that because dcollections implements interfaces, you must never store a concrete collection instance. This isn't true: you can use dcollections as concrete classes without dealing with interfaces at all, and all the functions should be inlinable (at least, they will be when I change them all to final functions).

dcollections provides all of the genericity that you need -- ranges, template member functions, etc. AND the containers also implement interfaces, should that be part of your design. I've pointed out that a good use case for passing interfaces instead of concrete classes is dynamic libraries, where you don't want private member changes to affect runtime code. -Steve
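The dual use described here (concrete type for inlining, interface for dynamic dispatch) can be sketched like this, with hypothetical stand-ins for dcollections' List/ArrayList:

```d
// A concrete container implements an interface for callers who want
// dynamic dispatch, while its methods are final so that calls through
// the concrete type can be inlined.
interface List
{
    size_t length();
}

class ArrayList : List
{
    private int[] data;
    final void add(int x) { data ~= x; }           // final: direct, inlinable
    final size_t length() { return data.length; }  // final, yet satisfies List
}

void main()
{
    auto a = new ArrayList;  // concrete type: final, inlinable calls
    a.add(42);
    List l = a;              // interface view: virtual dispatch when wanted
    assert(l.length == 1);
    assert(a.length == 1);
}
```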
May 20 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 08:22 AM, Steven Schveighoffer wrote:
 Michel Fortin Wrote:

 On 2010-05-20 06:34:42 -0400, Steven
 Schveighoffer<schveiguy yahoo.com>  said:

 I understand these points, but I'm already using interfaces to
 copy data between containers.  I don't have to, I could have used
 generic code, but this way, only one function is instantiated to
 copy data from all the other containers.  The problem with using
 generic code is that the compiler will needlessly duplicate
 functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

Yup. Even Java does that. I forgot whether it was a recanting of a former stance (as was the case with synchronized) or things were like that from the get-go.
 One thing I just thought of -- in dcollections, similar types can be
 compared to one another.  For example, you can check to see if a
 HashSet is equal to a TreeSet.  But that would not be possible
 without interfaces.

Of course it would be possible. You write a generic function that takes two generic containers and constrains the inputs such that at least one has element lookup capability. (Complexity-oriented design for the win!) Andrei
May 21 2010
next sibling parent BCS <none anon.com> writes:
Hello Andrei,

 On 05/20/2010 08:22 AM, Steven Schveighoffer wrote:
 
 Michel Fortin Wrote:
 
 On 2010-05-20 06:34:42 -0400, Steven
 Schveighoffer<schveiguy yahoo.com>  said:
 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used
 generic code, but this way, only one function is instantiated to
 copy data from all the other containers.  The problem with using
 generic code is that the compiler will needlessly duplicate
 functions that are identical.
 

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in terms of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

It's not that much slower. You get a much higher speedup when things can be inlined than virtual vs. non-virtual. However, I should probably make all the functions in the concrete implementations final. I made several of them final, but I should do it across the board.

Yup. Even Java does that. I forgot whether it was a recanting of a former stance (as was the case with synchronized) or things were like that from the get-go.

 One thing I just thought of -- in dcollections, similar types can be
 compared to one another.  For example, you can check to see if a
 HashSet is equal to a TreeSet.  But that would not be possible
 without interfaces.

Of course it would be possible. You write a generic function that takes two generic containers and constrains the inputs such that at least one has element lookup capability. (Complexity-oriented design for the win!) Andrei

Cool. Now how do I write code so that it will always iterate the collection with the bigger O() lookup time (O(n) before O(log2(n)) before O(log16(n)) before O(1))? :D -- ... <IXOYE><
May 21 2010
prev sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
BCS <none anon.com> wrote:

 Cool. Now how do I write code so that it will always iterate the  
 collection with the bigger O() lookup time (O(n) before O(log2(n))  
 before O(log16(n)) before O(1))? :D

Add a function.

    auto foo( R1, R2 )( R1 r1, R2 r2 )
        if ( R1.complexity( 10_000 ) > R2.complexity( 10_000 ) )
    { ... }

    auto foo( R1, R2 )( R2 r1, R1 r2 )
        if ( R1.complexity( 10_000 ) < R2.complexity( 10_000 ) )
    { ... }

-- Simen
May 22 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 06:17 AM, Michel Fortin wrote:
 On 2010-05-20 06:34:42 -0400, Steven Schveighoffer <schveiguy yahoo.com>
 said:

 I understand these points, but I'm already using interfaces to copy
 data between containers. I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data
 from all the other containers. The problem with using generic code is
 that the compiler will needlessly duplicate functions that are identical.

One question. Have you calculated the speed difference between using an interface and using generic code? Surely going through all those virtual calls slows things down a lot. I do like interfaces in principle, but I fear it'll make things much slower when people implement things in term of interfaces. That's why I'm not sure it's a good idea to offer container interfaces in the standard library.

There will be differences, but let's keep in mind that that's one of several arguments against container interfaces. Andrei
May 21 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/20/2010 05:34 AM, Steven Schveighoffer wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See "Tangled up in tools": http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, designing a library to be flexible enough to meet some future, anticipated need (e.g. DLLs) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

There is a copy() function that copies any range to any other range. It might need a revisit, but I think the way to go about copying is generic. Let's not forget that the code for copying itself is rather short and that applications don't tend to use a large number of container types.
 Using interfaces is not as viral as you think.  My interfaces can be
 used in generic code, as long as the generic code uses functions in
 the interfaces.  If a library returns an interface, the author is
 saying "I don't want you using any functions outside this interface,"
 so why is that a bad thing?

 Forcing people to *not* use interfaces has its drawbacks too.
 Dcollections gives the most flexible design I could muster, while
 still being useful.

 I'm not saying I'm against removing the interfaces until some later
 date, but I don't see any convincing arguments yet, especially since
 I've already seen benefits from having them.

What are the benefits that you have noticed? I think you'd need to back up the copying argument with some data if you want to frame it as a benefit. Andrei
May 21 2010
prev sibling next sibling parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.
May 20 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
superdan Wrote:

 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.

I think classes are the right move.

First, a collection makes more sense as a reference type. Note that both arrays and associative arrays are reference types. If collections are value types, like in C++, then copying a node-based collection means duplicating all the nodes. Copy construction is still possible through a function -- dup. Having value semantics makes it too easy to copy large amounts of heap data, hurting performance. Many inexperienced C++ coders pass a std::set by value, not realizing why their code is so ridiculously slow. I think one of the things that makes D so damn fast is that large data structures such as arrays and AA's are always passed by reference.

Second, since reference types are the right thing to do, classes are much easier to deal with. I know AA's are reference types that are structs, but the code needed to perform this feat is not trivial. The AA has only one member, a reference to the data struct, which is allocated on the heap. Any member function/property that is used on the AA must first check whether the implementation is allocated yet. The only benefit this gives you IMO is not having to use 'new' on it. And even that has some drawbacks. For example, pass an empty AA by value to a function, and if that function adds any data to it, it is lost. But pass an AA by value with one element in it, and the new data sticks.

A class gives you much more in terms of options -- interfaces, builtin synchronization, runtime comparison, etc. And it forces full reference semantics by default. I think regardless of whether interfaces are defined for dcollections, classes give a better set of options than structs.

Also note that I intend to make all dcollections classes final, so there will be no virtual calls as long as you have a reference to the concrete type.

Is there some other reason to use structs besides copy construction?

-Steve
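The AA pass-by-value quirk described above can be reproduced in a few lines (a sketch; `addOne` is a hypothetical name):

```d
void addOne(int[int] aa)
{
    aa[1] = 1;  // mutates the callee's copy of the AA handle
}

void main()
{
    int[int] empty;             // null handle: no implementation allocated yet
    addOne(empty);
    assert(empty.length == 0);  // callee allocated its own impl; data is lost

    int[int] seeded = [0: 0];   // impl already allocated before the call
    addOne(seeded);
    assert(1 in seeded);        // caller and callee share the impl; data sticks
}
```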
May 21 2010
next sibling parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 superdan Wrote:
 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Andrei Alexandrescu Wrote:
 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.


 I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

classes suck ass. structs give ye freedom 2 define copy ctors n shit. haven't seen andre agreein' classes are da way 2 go and i hope he don't. anyway u put together some cool shit. hope andre & u do a pow-wow n shit and adjust shit fer puttin' into phobos.


If collections are value types, like in C++, then copying a collection that is a node-based collection means duplicating all the nodes. Copy construction is essentially possible through a function -- dup. Having value-semantics makes it too easy to copy large amounts of heap data hurting performance. Many inexperienced C++ coders pass a std::set by value, not realizing why their code is so ridiculously slow. I think one of the things that makes D so damn fast is because large data structures such as arrays and AA's are always passed by reference.
 Second, since reference types are the right thing to do, classes are much easier to deal with. I know AA's are reference types that are structs, but the code needed to perform this feat is not trivial. The AA has only one member, a reference to the data struct, which is allocated on the heap. Any member function/property that is used on the AA must first check whether the implementation is allocated yet. The only benefit this gives you IMO is not having to use 'new' on it. And even that has some drawbacks. For example, pass an empty AA by value to a function, and if that function adds any data to it, it is lost. But pass an AA by value with one element in it, and the new data sticks. A class gives you much more in terms of options -- interfaces, builtin synchronization, runtime comparison, etc. And it forces full reference semantics by default. I think regardless of whether interfaces are defined for dcollections, classes give a better set of options than structs.
 Also note that I intend to make all dcollections classes final, so there will be no virtual calls as long as you have a reference to the concrete type.
 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.

den there's all that null ref shit. with a class u have

void foo(container!shit poo)
{
    poo.addElement(Shit(diarrhea));
}

dat works with struct but don't work with motherfucking classes. u need to write.

void foo(container!shit poo)
{
    if (!poo) poo = new container!shit; // fuck dat shit
    poo.addElement(Shit(diarrhea));
}

u feel me?
May 21 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
superdan Wrote:

 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article

 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.

This is not necessary with purely memory-based constructs -- the GC is your friend. The custom allocator ability in dcollections should provide plenty of freedom for memory allocation schemes.
 
 den there's all that null ref shit. with a class u have
 
 void foo(container!shit poo)
 {
     poo.addElement(Shit(diarrhea));
 }
 
 dat works with struct but don't work with motherfucking classes. u need to
write.
 
 void foo(container!shit poo)
 {
     if (!poo) poo = new container!shit; // fuck dat shit
     poo.addElement(Shit(diarrhea));
 }
 
 u feel me?

It doesn't work. Just with your example, you won't get what you expect. Try it with AA's. Even your class example doesn't make any sense. -Steve
May 21 2010
parent reply superdan <super dan.org> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 superdan Wrote:
 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 Is there some other reason to use structs besides copy construction?
 -Steve

memory management n shit. with a struct u can use refcounting n malloc n realloc n shit. still stays a reference type. nothing gets fucked up.


 This is not necessary with purely memory-based constructs -- the GC is your friend. The custom allocator ability in dcollections should provide plenty of freedom for memory allocation schemes.

how do u set up yer custom allocator to free memory? u cant tell when its ok. copying refs iz under da radar. dats my point.
 den there's all that null ref shit. with a class u have

 void foo(container!shit poo)
 {
     poo.addElement(Shit(diarrhea));
 }

 dat works with struct but don't work with motherfucking classes. u need to
write.

 void foo(container!shit poo)
 {
     if (!poo) poo = new container!shit; // fuck dat shit
     poo.addElement(Shit(diarrhea));
 }

 u feel me?


wut? it don't work? whaddaya mean it dun work? is you crazy? what dun work? maybe therez sum misundercommunication. repeating. if container is struct this shit works:

void foo(container!shit poo)
{
    poo.addElement(Shit(diarrhea));
}

dun tell me it dun work. i dun explain shit again. it works coz a struct cant be null. but a struct can be a ref if it only haz one pointer inside. methinks the builtin hash iz dat way. if container iz class dat shit dun work. u need to write dis shit:

void foo(container!shit poo)
{
    if(!poo) poo = new container!shit; // fuck dat shit
    poo.addElement(Shit(diarrhea));
}

dat sucks bull ballz.
May 21 2010
parent BCS <none anon.com> writes:
Hello superdan,


 dun tell me it dun work. i dun explain shit again. it works coz a
 struct cant be null. but a struct can be a ref if it only haz one
 pointer inside. methinks the builtin hash iz dat way.
 

 void foo(container!shit poo)
 {
 if(!poo) poo = new container!shit; // fuck dat shit
 poo.addElement(Shit(diarrhea));
 }
 dat sucks bull ballz.

That code is broken: if poo is of a class type, it just new's a container, adds an element, and then drops the reference to be GC'ed. To make it do anything generally useful, it would need another level of indirection. -- ... <IXOYE><
May 21 2010
prev sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/21/2010 09:14 AM, Steven Schveighoffer wrote:
 Second, since reference types are the right thing to do, classes are much
easier to deal with.  I know AA's are reference types that are structs, but the
code needed to perform this feat is not trivial.  The AA has only one member, a
reference to the data struct, which is allocated on the heap.  Any member
function/property that is used on the AA must first check whether the
implementation is allocated yet.  The only benefit this gives you IMO is not
having to use 'new' on it.  And even that has some drawbacks.  For example,
pass an empty AA by value to a function, and if that function adds any data to
it, it is lost.  But pass an AA by value with one element in it, and the new
data sticks.  A class gives you much more in terms of options -- interfaces,
builtin synchronization, runtime comparison, etc.  And it forces full reference
semantics by default.  I think regardless of whether interfaces are defined for
dcollections, classes give a better set of options than structs.

Wow. A partially-nullable type. Great. Now I have to review everywhere I ever used an AA. Thanks, D.

Is there any serious drawback to something like (int[int]).init = InitializedAA!(int,int) ?
May 21 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/21/2010 11:55 AM, Ellery Newcomer wrote:
 On 05/21/2010 09:14 AM, Steven Schveighoffer wrote:
 Second, since reference types are the right thing to do, classes are
 much easier to deal with. I know AA's are reference types that are
 structs, but the code needed to perform this feat is not trivial. The
 AA has only one member, a reference to the data struct, which is
 allocated on the heap. Any member function/property that is used on
 the AA must first check whether the implementation is allocated yet.
 The only benefit this gives you IMO is not having to use 'new' on it.
 And even that has some drawbacks. For example, pass an empty AA by
 value to a function, and if that function adds any data to it, it is
 lost. But pass an AA by value with one element in it, and the new data
 sticks. A class gives you much more in terms of options -- interfaces,
 builtin synchronization, runtime comparison, etc. And it forces full
 reference semantics by default. I think regardless of whether
 interfaces are defined for dcollections, classes give a better set of
 options than structs.

Wow. A partially-nullable type. Great. Now I have to review everywhere I ever used an AA. Thanks, D. is there any serious drawback to something like (int[int]).init = InitializedAA!(int,int) ?

Or should one just always give an AA param either a const or ref modifier?
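For what it's worth, a ref parameter does sidestep the lost-update case, since the caller's handle itself gets updated when the first insertion allocates the implementation (a minimal sketch; `addOne` is a hypothetical name):

```d
void addOne(ref int[int] aa)
{
    aa[1] = 1;  // through ref, the allocation rebinds the caller's handle
}

void main()
{
    int[int] empty;      // null handle, nothing allocated yet
    addOne(empty);
    assert(1 in empty);  // the new impl (and the data) is visible to the caller
}
```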
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the
 container interfaces are unable to express iteration is a corollary
 of the design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality.
 Then a few concept checks (a la std.range checks) can easily
 express what capabilities a given client function needs from a
 container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are.

Without final, they are the roots of a hierarchy. But I understand you are making containers final, which is great.
 I.e. there aren't many
 kinds of HashMaps that derive from each other.  But the interfaces
 are not detrimental to your ideas.  The only thing interfaces require
 is that the entities implementing them are classes and not structs.
 As long as you agree that classes are the right call, then interfaces
 can co-exist with your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.
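A "perfectly encapsulated" container in this sense might be sketched as follows (hypothetical names; elements pass in and out by value only, so the struct is free to manage its own storage):

```d
// Sketch: a stack that never escapes the addresses of its elements,
// making it a candidate for manual storage management.
struct SealedStack(T)
{
    private T[] store;  // element addresses never leave this struct

    void push(T value) { store ~= value; }

    // returns by value -- no pointer or ref into store is ever exposed
    T pop()
    {
        auto v = store[$ - 1];
        store = store[0 .. $ - 1];
        return v;
    }

    bool empty() const { return store.length == 0; }
}
```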
 Yes, if you want to define "this function needs something that is
 both addable and purgeable, I don't have an interface for that.  But
 a concept can certainly define that generically (which is what you
 want anyways), or you could just say "I need a List" and get those
 functions also.  It also does not force entities other than
 dcollections objects to be classes, they could be structs and
 implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it is
 mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with some half-understood seeming of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.
 But I can see
 one good reason to keep them -- binary interoperability.  For
 example, it might be the case some day when D has good support with
 dynamic libraries that a library exposes some piece of itself as a
 Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.
 So my answer is -- go ahead and define these concepts and required
 names, and you can ignore the interfaces if they don't interest you.
 They do not subtract from the possibilities, and others may find good
 use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion. Why would we keep in the standard library bad design with the advice that "if you don't like it ignore it"? Andrei
May 21 2010
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:
 
 I don't know Tango, but Java's containers are a terrible example to 
 follow. Java's container library is an ill-advised design on top of an 
 underpowered language, patched later with some half-understood seeming 
 of genericity. I think Java containers are a huge disservice to the 
 programming community because they foster bad design.

Tango's container library is or was a port of Doug Lea's containers for Java. While Doug Lea is an absolute master with concurrency, I never liked his container library very much. As you've said, the common abstractions it draws are weird and not terribly useful.
 I need to disagree with that. I've done and I do a ton of binary 
 interoperability stuff. You never expose a generic container interface! 
 Interoperable objects always embody high-level logic that is specific to 
 the application. They might use containers inside, but they invariably 
 expose high-level, application-specific functionality.

This. Iterators (or ranges) are passed all over the place, but when operating directly on a container I always want to know what kind of container I'm dealing with. Cost of operations is an issue, I may need to manually sort at some point or be aware that iterators will be invalidated, etc. Steve has already said he doesn't use the interfaces, and I'm a huge fan of not doing speculative design. It's invariably wrong and then you get stuck supporting it. I'd vote to drop the interfaces. They can always be added back later anyway.
May 21 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/24/2010 10:01 AM, Steven Schveighoffer wrote:
 On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many kinds of HashMaps that derive from each
 other. But the interfaces are not detrimental to your ideas. The
 only thing interfaces require is that the entities implementing
 them are classes and not structs. As long as you agree that
 classes are the right call, then interfaces can co-exist with
 your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting, I don't think that is a good design.

Why? I think refcounting for containers is perfect. Refcounting small objects is bad because the counter overhead is large. Also, small objects tend to be created and passed around a lot, so the time overhead is significant. In contrast, refcounting and containers are a perfect marriage. The container is a relatively large conglomerate of objects so refcounting that is bound to yield good benefits in terms of memory reclamation.
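A ref-counted container handle along these lines could be sketched with a D2 postblit (a hypothetical illustration, not dcollections code; opAssign omitted for brevity):

```d
struct RcContainer(T)
{
    private static struct Impl { T[] data; size_t refs; }
    private Impl* impl;

    static RcContainer make()
    {
        RcContainer c;
        c.impl = new Impl(null, 1);  // one conglomerate allocation per container
        return c;
    }

    this(this)  // postblit: copying the handle just bumps the count
    {
        if (impl) ++impl.refs;
    }

    ~this()     // last handle out reclaims the whole payload at once
    {
        if (impl && --impl.refs == 0)
            impl.data = null;  // or hand it back to a custom allocator
    }

    void add(T item) { impl.data ~= item; }
    size_t length() const { return impl ? impl.data.length : 0; }
}
```

The counter overhead is amortized over the whole collection, which is why it pays off here where it would not for small objects.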
 Yes, if you want to define "this function needs something that
 is both addable and purgeable, I don't have an interface for
 that. But a concept can certainly define that generically (which
 is what you want anyways), or you could just say "I need a List"
 and get those functions also. It also does not force entities
 other than dcollections objects to be classes, they could be
 structs and implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it
 is mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with some half-understood seeming of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.

Java has some warts as you have rightfully pointed out in the past (i.e. O(n) lookup), but I have attempted to remove all those warts. Dcollections I would hope does not suffer from the problems that Java's collections suffer from.

That's great. But let me quote what you said: "I myself don't really use the interface aspect of the classes, it is mostly a carryover from the Java/Tango inspirations." I took that to mean you don't have a strong justification for structuring dcollections as a hierarchy, and furthermore that makes me hope it's possible you'd be willing to yank that dinosaur brain.
 But I can see one good reason to keep them -- binary
 interoperability. For example, it might be the case some day when
 D has good support with dynamic libraries that a library exposes
 some piece of itself as a Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.

It's done all the time in Java and .NET. For example, a GUI listbox widget exposes its elements as an array of elements, which implement the List interface. You don't ever see the implementation or need it. Granted Java and .NET have fewer problems than C++ and D with binary compatibility, since the function tables are dynamic, but the potential is there for D to make binary compatibility possible with interfaces.

I see.
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.
 So my answer is -- go ahead and define these concepts and
 required names, and you can ignore the interfaces if they don't
 interest you. They do not subtract from the possibilities, and
 others may find good use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion.

The reason I put them in is because they existed before, but thinking about what they would be useful for, D doesn't really support dynamic libraries all that well (recent advances have brought that closer). But once it does, I think it will be much more important to keep compiled libraries compatible between versions of the standard library. This is typically not possible with templates; I know C++ is horrible at it from personal experience. So I'm not convinced it is not quite good at anything. I think it's a solution to a problem that doesn't exist yet on D because D's linking features are stuck in 1995.

Clearly interfaces have their advantages. I just happen to think they are much thinner than their disadvantages for this particular case.
 Why would we keep in the standard library bad design with the
 advice that "if you don't like it ignore it"?

People have continuously used that argument against const and immutable. Really that argument is fine, as long as you don't consider it bad design, which I don't :)

Looks like we're heading straight to stalemate once again. In the interest of time, it would be better to get to stalemate (or, hopefully, agreement) so we know whether dcollections will be integrated within Phobos or whether I should spend this and next weeks' evenings coding. To that end, please let me know whether it's worth that I spend time on putting together a list of proposed changes. Andrei
May 24 2010
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits? -- Bruno Medeiros - Software Engineer
May 26 2010
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and even phobos or dcollections isn't a dynamic lib.

I've got my patch for building Tango as a dynamic library on Mac quite recently included in trunk. And I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch making it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to still think that dynamic libraries aren't supported; we have to think forward and assume they will be supported.
 And I have specifically decided not to use interfaces with ranges
 because that makes them reference types. Ranges work well as value
 types, but not well as reference types. Therefore, to use dcollections
 as interfaces, you must not require the range traits.

 -Steve

-- /Jacob Carlborg
May 28 2010
parent Jacob Carlborg <doob me.com> writes:
On 2010-05-28 14:12, Steven Schveighoffer wrote:
 On Fri, 28 May 2010 06:10:49 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

I've got my patch for building Tango as a dynamic library on Mac included in trunk quite recently. I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch making it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to still think that dynamic libraries aren't supported; we have to think forward and assume they will be supported.

I remember that, and I'm very encouraged by it. That being said, the ultimate goal is to have dmd be able to build dynamic libraries easily. D has had dynamic library "support" for years, but you have to do all kinds of manual stuff, and the standard library isn't dynamic. Until the standard library is dynamic and I can build a dynamic library with a -shared equivalent switch, dynamic libs are a laboratory feature, and not many projects will use them.

Yes, exactly, I noticed there isn't an easy way to build dynamic libraries; among other things, you have to know the path to the standard library when building manually.
 Just so you know, I think it's important to support binary compatibility
 in dcollections, and since std.container has not adopted dcollections,
 I'm going to keep interfaces. I was just pointing out the position
 others may have WRT binary support.

 -Steve

-- /Jacob Carlborg
May 28 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 27/05/2010 11:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

Ah, never mind, my mind slipped and I was thinking of any kind of library, that is, static ones as well. Although even just dynamic-library compatibility seems to be a valid enough case that we should consider it from the start, even if it's not well supported currently. -- Bruno Medeiros - Software Engineer
May 28 2010
prev sibling parent Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Mon, 24 May 2010 11:01:06 -0400, Steven Schveighoffer 
<schveiguy yahoo.com> wrote:
 It's done all the time in Java and .NET.  For example, A GUI listbox  
 widget exposes its elements as an array of elements, which implement the  
 List interface.  You don't ever see the implementation or need it.   
 Granted Java and .NET have less problems than C++ and D with binary  
 compatibility, since the function tables are dynamic, but the potential is  
 there for D to make binary compatibility possible with interfaces.
 

Cocoa (NeXT's/Apple's framework for Objective-C) uses a very successful and well-thought-out delegation pattern, whereby GUI elements representing large amounts of data, like table views, have delegates (not in the D sense of the word) that provide them with the actual contents. Granted, Objective-C's runtime is much more dynamic than D's, but a simplified version of such a pattern could still work in D. After all, user interfacing is typically where dynamism is more important than speed.
May 24 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 09:59 PM, Robert Jacques wrote:
 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

Yes, if you want to define "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyways), or you could just say "I need a List" and get those functions also. It also does not force entities other than dcollections objects to be classes; they could be structs and implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them -- binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is -- go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them. Does that make sense?

-Steve
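The two styles being debated can be sketched side by side. This is a hypothetical illustration, not the actual dcollections API: the `List` interface here is a minimal stand-in, and `isAddable` is an assumed concept check in the std.range style.

```d
// Minimal stand-in for a container interface (hypothetical, not the
// real dcollections List).
interface List(V)
{
    List!V add(V v);
}

// Concept check: any type (class or struct) with a suitable add()
// method qualifies, in the manner of std.range's isInputRange et al.
template isAddable(C, V)
{
    enum bool isAddable = is(typeof({ C c; V v; c.add(v); }));
}

// Interface-constrained: virtual dispatch, usable across a binary boundary.
void fillViaInterface(List!int l) { l.add(42); }

// Concept-constrained: compile-time check, no indirection required.
void fillViaConcept(C)(C c) if (isAddable!(C, int)) { c.add(42); }

// A plain struct satisfies the concept without touching the interface.
struct IntBag
{
    int[] items;
    void add(int v) { items ~= v; }
}

static assert(isAddable!(IntBag, int));
```

The point of the sketch: the concept route accepts structs and classes alike, while the interface route remains available for code that needs runtime or binary-boundary polymorphism; the two do not conflict.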

Yes and No. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

For the record, I strongly agree with this.
 Second, I
 think designing a library to be flexible enough to meet some future,
 anticipated need (e.g. dlls) is a good idea, but actually implementing
 vaporous future needs is fraught with peril; it's too easy to guess
 wrong. Third, interface-based design is viral; if library X uses
 interfaces then I have to use interfaces to interface with it. And if
 another library Y uses classes, then I'm going to have to write a
 (needless) wrapper around one of them.

That's a good argument as well. I like to put it a different way: you can get the advantages of an interface by wrapping a struct, but you can't get the advantages of a struct by wrapping an interface.

Andrei
May 21 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 On 05/19/2010 09:59 PM, Robert Jacques wrote:
 Yes and No. I understand where you're coming from, but I think it's a bad
 idea. First, I think it needlessly expands the radius of comprehension
 needed to understand and use the library. (See Tangled up in tools
 http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools)

For the record, I strongly agree with this.

I do too, but that's the easy part. Living up to those ideals is extremely hard, usually because most designers think they have designed simple interfaces when everyone else thinks they didn't.

In this whole container discussion, I'd like to point out something Andrei pointed out to me. The one extremely successful example of pluggable components is the unix filter model. Essentially, one program pipes its output to the next, which pipes its output to the next, etc. The unix console is designed around that paradigm.

If we can get anywhere close to that level of success with ranges and containers, we should all be well pleased.
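The unix filter model Walter refers to can be shown in one line; each stage below knows nothing about the others beyond "lines in, lines out" (the sample data is made up for illustration):

```shell
# Each filter is an independent component; the pipe is the shared
# interface, much as ranges would be between containers and algorithms.
printf 'error\nok\nerror\nok\nerror\n' \
  | sort \
  | uniq -c \
  | sort -rn
```

Neither `sort` nor `uniq` was designed with the other in mind, yet they compose freely; that is the level of interoperability the range/container design is aiming at.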
May 21 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

One example of bad impedance matching is C++ iostreams' attempt to make a memory buffer look like a file. Phobos propagated that mistake in its own streams.
May 21 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com> said:

 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

	class Channel {
		private {
			Array!Message inbound;
			Array!Message outbound;
		}
		...
	}

What's the point of having extra indirection here?

I've been thinking about having both by-value and by-reference containers. The first-class name would be given to the by-reference container to give it more visibility, but that by-reference container would be a simple wrapper for a by-value container "part" implemented in a struct:

	struct ArrayPart(T) { ... } // by-value array container.

	class Array(T) {
		ArrayPart!T part;
		alias part this;
	}

So now, if you want to reuse a container "part" to build some kind of more complex data structure, you can do it like this:

	class Channel {
		private {
			ArrayPart!Message inbound;
			ArrayPart!Message outbound;
		}
		...
	}

No silly extra indirection.

That said, all this gives me a feeling of an overcomplicated design. After all, the problem is that you want to pass the container by reference in function arguments, but it's __too easy to forget the ref__. Perhaps that's the problem that should be fixed. Couldn't we just make a struct that cannot be implicitly copied? Perhaps something like this:

	explicitdup struct Array {  }

	void testVal(Array array);
	void testRef(ref Array array);

	unittest {
		Array array;
		testVal(array);     // error, cannot copy array implicitly
		testVal(array.dup); // ok, array is copied
		testRef(array);     // ok, array is passed by reference
	}

If there's already a way to achieve this, I couldn't find it.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 22 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:
 and you have to worry about null.

Nonnull references of course solve this problem.
 Couldn't we just make a struct that cannot be implicitly copied? 

The @disable attribute was invented for this purpose too:

	struct Foo {
		int x;
		@disable this(this) {}
	}
	void main() {
		Foo f1;
		Foo f2 = f1;
	}

That prints:

	test.d(7): Error: struct test9.Foo is not copyable because it is annotated with @disable

Bye,
bearophile
May 22 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 08:16:27 -0400, bearophile <bearophileHUGS lycos.com> said:

 Michel Fortin:
 Couldn't we just make a struct that cannot be implicitly copied?

The @disable attribute was invented for this purpose too:

	struct Foo {
		int x;
		@disable this(this) {}
	}
	void main() {
		Foo f1;
		Foo f2 = f1;
	}

That prints:

	test.d(7): Error: struct test9.Foo is not copyable because it is annotated with @disable

Indeed, thanks. I figured it out by myself while you were writing this. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 22 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 07:56:31 -0400, Michel Fortin <michel.fortin michelf.com> said:

 	 explicitdup struct Array {  }
 
 	void testVal(Array array);
 	void testRef(ref Array array);
 
 	unittest {
 		Array array;
 		testVal(array);     // error, cannot copy array implicitly
 		testVal(array.dup); // ok, array is copied
 		testRef(array);     // ok, array is passed by reference
 	}
 
 If there's already a way to achieve this, I couldn't find it.

Apparently it's already achievable this way:

	struct Array {
		@disable this(this);
		Array dup() { return Array(...); }
		...
	}

It also blocks simple assignments:

	Array array2 = array;     // error, array is not copyable
	Array array3 = array.dup; // ok

With this, I don't think we need containers to be reference types anymore. The naive error of copying containers everywhere without you knowing about it (as it occurs in C++) is simply no longer possible.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
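Michel's sketch elides the body; a minimal compilable version of the same pattern might look as follows (the `int[]` payload and names are made up for illustration, not from any actual container library):

```d
struct Array
{
    int[] data;

    @disable this(this); // postblit disabled: no implicit copies

    // Explicit, visibly O(n) copy.
    Array dup() { return Array(data.dup); }
}

void main()
{
    auto a = Array([1, 2, 3]);
    // Array b = a;   // compile error: Array is not copyable
    auto c = a.dup;   // ok: copy is explicit
    assert(c.data == a.data && c.data !is a.data); // same contents, distinct storage
}
```

Note that returning `Array` from `dup` still works despite the disabled postblit, because returning a temporary is a move, not a copy.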
May 22 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Michel Fortin wrote:
 On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com> 
 said:
 
 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

	class Channel {
		private {
			Array!Message inbound;
			Array!Message outbound;
		}
		...
	}

What's the point of having extra indirection here?

Good question. I think the answer is:

1. When do you ever want to copy a collection? I almost never do, because copying one is an inherently expensive operation.

2. When you copy a collection, do you copy the container or the elements in the container or both? One would have to carefully read the documentation to see which it is. That increases substantially the "cognitive load" of using them.

3. Going to lengths to make them value types, but then disabling copying them because you want people to use them as reference types, seems like a failure of design somewhere.

4. That silly extra level of indirection actually isn't there. Consider that even value locals are accessed via indirection: offset[ESP]. For a reference collection we have: offset[EBX]. No difference (yes, EBX has to be loaded, but if it is done more than once it gets cached in a register by the compiler).

5. Just making them all reference types means the documentation and use become a lot simpler.
May 22 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-22 16:01:37 -0400, Walter Bright <newshound1 digitalmars.com> said:

 Michel Fortin wrote:
 What's the point of having extra indirection here?

Good question. I think the answer is:

1. When do you ever want to copy a collection? I almost never do, because copying one is an inherently expensive operation.

Whenever you need to preserve the previous state of something before applying some transformation on it. But I agree that the copy should be explicit because it is O(n), hence my suggestion of disabling implicit copying for containers.

Since we're at it, a reference-type container sometimes makes it too easy to just create a new reference to the same container when what you really want is to make a copy. I happen to have a bug of this sort to fix in my Objective-C program right now, where a reference to a container leaked where it should have been a copy, causing unwanted mutations to the .
 2. When you copy a collection, do you copy the container or the 
 elements in the container or both? One would have to carefully read the 
 documentation to see which it is. That increases substantially the 
 "cognitive load" of using them.

I don't see the extra cognitive load. Assuming you disable implicit copying of the container, you'll have to use ".dup", which will work exactly as an array. The way items are copied is exactly the same as if the container were a reference type (you call a "dup" or equivalent function and things get copied).
 3. Going to lengths to make them value types, but then disabling 
 copying them because you want people to use them as reference types, 
 seems like a failure of design somewhere.

I agree in a way. But at the same time, forcing everyone to use a reference type when sometimes a value type would be more adequate also looks like a failure of design to me. To me, the best tradeoff seems to be a value type, because it's quite trivial to create a reference type from a value type when you need it; the reverse is awkward.
 4. That silly extra level of indirection actually isn't there. Consider 
 that even value locals are accessed via indirection: offset[ESP]. For a 
 reference collection we have: offset[EBX]. No difference (yes, EBX has 
 to be loaded, but if it is done more than once it gets cached in a 
 register by the compiler).

Have you ever worked with containers of containers? Surely yes, since D associative arrays are one of them. So assume we want to implement our associative arrays like this:

	class HashTable(Key, Value) {
		Array!(Tuple!(Hash!Key, TreeSet!(Tuple!(Key, Value)))) buckets;
	}

Do you find it reasonable that the TreeSet be a reference type? Reference-type containers would mean one indirection and one extra allocated block for each bucket. Then add that 'Value' could itself be a struct or class containing its own container, and you're stuck with a third unnecessary level of indirection and extra calls to the GC to allocate containers and/or check for null. Sounds quite wasteful to me. In addition, those extra allocations add more logic to our hash table and thus more chances for bugs.

Here I'm using a hash table as an example; the same problem applies to many other data structures, whether they are generic or specific to a particular problem. Containers should be efficient and easy to use when composed one inside another. That's one of the greatest strengths of C++ value-type containers in my opinion.
 5. Just making them all reference types means the documentation and use 
 become a lot simpler.

Simpler to explain, maybe. Simpler to use, I have my doubts. You're just moving the burden somewhere else. A reference-type container requires a "new Container()" somewhere, and some protection logic against null. In exchange, you don't need to write 'ref' in functions taking containers, and can easily copy references to the container everywhere (sometimes too easily). But the reference-type benefits aren't entirely lost with a value type, because it's trivial to turn a value type into a reference type.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 23 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:
 Scope class members should solve this.  It's been thrown around forever --  
 essentially, you should be able to define that a class member's storage is  
 contained within the owner object.

I too have proposed this idea, but in my opinion you can't design a big part of a standard lib on the basis of a syntax/language optimization that doesn't exist yet.

Bye,
bearophile
May 24 2010
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and 
 containers, we should all be well pleased.

Mike Taylor has a phrase for that I think is well-coined: "impedance matching", defined as the work necessary to get one library module to work with another library module.

One example of bad impedance matching is C++ iostreams' attempt to make a memory buffer look like a file. Phobos propagated that mistake in its own streams. A better way is to use ranges, where the file is accessed via a range and a memory buffer is accessed via a range. Then anything that operates on data is written to the range, not a fake file interface.
May 21 2010
prev sibling parent Alex Makhotin <alex bitprox.com> writes:
Steven Schveighoffer wrote:
 
 I myself don't really use the interface aspect of the classes, it is mostly a
carryover from the Java/Tango inspirations.  But I can see one good reason to
keep them -- binary interoperability.  For example, it might be the case some
day when D has good support for dynamic libraries that a library exposes some
piece of itself as a Map or List interface.
 

Hi, Steven. You made a good point on interoperability. Strict, precise, readable interfaces, that's what I would like in a really good standard library for D. But is D mature enough for this kind of possibility, considering, at least, the known bugs involving interfaces? I don't even feel free to use interfaces in my code because of the undefined behavior they may cause. Shared libraries... Is this going to happen on Linux? When?

-- 
Alex Makhotin, the founder of BITPROX,
http://bitprox.com
May 21 2010
prev sibling next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
Graham St Jack Wrote:

 While I haven't read dcollections yet, I definitely agree with you about 
 not liking container hierarchies, and about the importance of support for 
 ranges.
 
 I hope Steven can be convinced that this is a good way to go :-).

All dcollections classes support ranges. I am already convinced of that :) -Steve
May 19 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-19 19:01:51 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 I wrote a solution to the problem in native D. It goes like this:
 
 alias Container!(int, addable | purgeable) Messerschmidt;
 
 void messWith(Messerschmidt i) {
      ... use i's capabilities to add and purge ...
 }

Are you sure this is necessary? I'm wondering how the above is different from:

	void messWith(C)(C i)
		if (IsAddable!C && IsPurgeable!C)
	{
		... use i's capabilities to add and purge ...
	}

where IsAddable just checks for an 'add' function and IsPurgeable checks for a 'purge' function. Obviously, one is a template and the other isn't. I'd expect the template function to be more performant since it doesn't require an indirection to call into the container. Is the need for runtime-swappable containers really common enough to justify adding it to the standard library? Won't adding this encourage people to use it without realizing the downside in performance and failed optimization opportunities because of the hidden dynamic dispatch? It's quite a nice idea, but I don't like the tradeoff.

This criticism is valid for containers implementing interfaces too. In my first Java programs, I was always declaring variables as the List interface, then instantiating an ArrayList for them, thinking it'd make things more generic and easier to change later. Generic sometimes is good, but if you do that with containers in D you're in for an important performance drop. Personally, I'd scrap anything that's not made of static calls (final functions in a class are fine) so people can't easily make these kinds of mistakes (and then believe D is slow).

Also, addable and purgeable above being or'ed constants makes the system difficult to scale to new concepts. The template predicates on the other hand are infinitely extendable: if my containers have 'commit' and 'rollback' functions, I can define IsTransactional to check for the presence of the functions and write some algorithms that benefit from this. In fact, this can apply to anything, not just containers. Ranges are already using this pattern. Wouldn't it make things easier to learn if we could just reuse the same principle with containers?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 19 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 09:48 PM, Michel Fortin wrote:
 On 2010-05-19 19:01:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 I wrote a solution to the problem in native D. It goes like this:

 alias Container!(int, addable | purgeable) Messerschmidt;

 void messWith(Messerschmidt i) {
 ... use i's capabilities to add and purge ...
 }

Are you sure this is necessary? I'm wondering how the above is different from:

	void messWith(C)(C i)
		if (IsAddable!C && IsPurgeable!C)
	{
		... use i's capabilities to add and purge ...
	}

It's different in that messWith is a type-parameterized function (aka a template in C++) with the known tradeoffs: multiple instantiations, risk of code bloating, but good speed most of the time. Add to that that some people aren't comfortable with those.
 where IsAddable just checks for an 'add' function and IsPurgeable checks
 for a 'purge' function. Obviously, one is a template and the other
 isn't. I'd expect the template function to be more performant since it
 doesn't require an indirection to call into the container. Is the need
 for runtime-swappable containers really common enough to justify adding
 it to the standard library? Won't adding this encourage people to use it
 without realizing the downside in performance and failed optimization
 opportunities because of the hidden dynamic dispatch? It's a quite nice
 idea, but I don't like the tradeoff.

These arguments are in line with mine. I tried above to convey that the situation is not all-win.
 This criticism is valid for containers implementing interfaces too. In
 my first Java programs, I was always declaring variables as the List
 interface, then instantiating an ArrayList for them, thinking it'd make
 things more generic and easier to change later. Generic sometime is
 good, but if you do that with containers in D you're in for an important
 performance drop. Personally, I'd scrap anything that's not made of
 static calls (final functions in a class are fine) so people can't
 easily make these kind of mistakes (and then believe D is slow).

By the way, I dislike the name ArrayList. Is it just me, or "list" is most often associated with "linked list" in computer lingo? So when I see "ArrayList" it looks like an oxymoron. Steve, could I impose on you to rename ArrayList simply Array?
 Also, addable and purgeable above being or'ed constants makes the system
 difficult to scale to new concepts. The template predicates on the other
 hand are infinitely extendable: if my containers have 'commit' and
 'rollback' functions, I can define IsTransactional to check for the
 presence of the functions and make some algorithms that benefits from
 this. In fact, this can apply to anything, not just containers. Ranges
 are already using this pattern. Wouldn't it make things easier to learn
 if we could just reuse the same principle with containers?

Nice arguments! Andrei
May 21 2010
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/19/2010 07:57 PM, Bill Baxter wrote:
 On Wed, May 19, 2010 at 4:01 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a few
 concept checks (a la std.range checks) can easily express what capabilities
 a given client function needs from a container.

 Destroy me :o).

So instead of STL's concept hierarchy, you have essentially concept tags. Very Web 2.0. :-)

I agree that there doesn't seem to be any coding benefit to STL's concepts being hierarchical. If you need a push_back(), you've got to check for push_back(). The main benefit seems to be for documentation purposes, allowing you to say things like "bidirectional_iterator has this and that, plus everything in forward_iterator". But that could easily be rephrased as "it has backward_iteration plus forward_iteration" with two pages describing those two tags.

So I like the sound of it. But it seems actually a pretty small departure from the STL approach, in practice.

Well in fact STL has a concept hierarchy for iterators (which D also has for ranges), and a flat, unstructured approach to containers. I don't mind keeping what STL does if there's no good reason not to. One change I do think is beneficial is making containers reference types by default.

Andrei
May 21 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 I wrote a solution to the problem in native D. It goes like this:
 
 alias Container!(int, addable | purgeable) Messerschmidt;
 
 void messWith(Messerschmidt i) {
     ... use i's capabilities to add and purge ...
 }

I agree with Michel Fortin that the | is questionable. I'd like to suggest instead that it should be a variadic list of names, like:

	alias Container!(int, addable, purgeable) Messerschmidt;

Perhaps the names should follow a naming convention:

	alias Container!(int, ContainerAddable, ContainerPurgeable) Messerschmidt;

The problem with using scoped names, like Container.Addable, is that scoped names cannot be added to.
May 21 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/21/2010 01:34 PM, Walter Bright wrote:
 Andrei Alexandrescu wrote:
 I wrote a solution to the problem in native D. It goes like this:

 alias Container!(int, addable | purgeable) Messerschmidt;

 void messWith(Messerschmidt i) {
 ... use i's capabilities to add and purge ...
 }

I agree with Michel Fortin that the | is questionable. I'd like to suggest instead that it be a variadic list of names, like:

    alias Container!(int, addable, purgeable) Messerschmidt;

Perhaps the names should follow a naming convention:

    alias Container!(int, ContainerAddable, ContainerPurgeable) Messerschmidt;

The problem with using scoped names, like Container.Addable, is that scoped names cannot be added to.

Well the problem stays: compound interfaces grow combinatorially with the number of components, because an interface X!(A, B, C) inherits at the same time X!(A, B), X!(A, C), and X!(B, C). Check the epic inheritance diagram here: http://www.artima.com/cppsource/codefeatures2.html and the equally epic table here: http://www.artima.com/cppsource/codefeatures3.html Andrei
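The combinatorial growth can be spelled out in a small D sketch (all names here are illustrative only):

```d
// For X!(A,B,C) to convert to X!(A,B), X!(A,C), and X!(B,C), every
// non-empty subset of capabilities needs its own interface --
// 2^n - 1 interfaces for n capabilities.
interface Addable(T)  { void add(T item); }
interface Purgeable   { void purge(); }
interface Iterable(T) { int opApply(scope int delegate(ref T) dg); }

// Pairwise and triple combinations, written out by hand:
interface AddablePurgeable(T)  : Addable!T, Purgeable {}
interface AddableIterable(T)   : Addable!T, Iterable!T {}
interface PurgeableIterable(T) : Purgeable, Iterable!T {}
interface AddablePurgeableIterable(T)
    : AddablePurgeable!T, AddableIterable!T, PurgeableIterable!T {}

void main()
{
    // n = 3 capabilities already require 2^3 - 1 = 7 interfaces, and the
    // count doubles with each capability added.
    assert((1 << 3) - 1 == 7);
}
```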
May 21 2010
prev sibling next sibling parent Graham St Jack <Graham.StJack internode.on.net> writes:
While I haven't read dcollections yet, I definitely agree with you about 
not liking container hierarchies, and about the importance of support for 
ranges.

I hope Steven can be convinced that this is a good way to go :-).
May 19 2010
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, May 19, 2010 at 4:01 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a few
 concept checks (a la std.range checks) can easily express what capabilities
 a given client function needs from a container.

 Destroy me :o).

So instead of STL's concept hierarchy, you have essentially concept tags. Very Web 2.0. :-) I agree that there doesn't seem to be any coding benefit to STL's concepts being hierarchical. If you need a push_back(), you've got to check for push_back(). The main benefit seems to be for documentation purposes, allowing you to say things like "bidirectional_iterator has this and that, plus everything in forward_iterator". But that could easily be rephrased as "it has backward_iteration plus forward_iteration" with two pages describing those two tags. So I like the sound of it. But it seems actually a pretty small departure from the STL approach, in practice. --bb
May 19 2010
prev sibling next sibling parent reply Bernard Helyer <b.helyer gmail.com> writes:
Oooohhh goody goody goody!   n_n


I'm in the process of making a little toy project ATM. I shall
integrate dcollections 2.0a into it ASAP.
May 19 2010
parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody!  n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.
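A minimal reproduction of the warning: with -w, dmd requires the `override` keyword on methods that override a base-class method, here Object.opEquals. ArrayListLike below is a stand-in, not dcollections' actual ArrayList:

```d
class ArrayListLike
{
    int[] data;

    // Writing `bool opEquals(Object o)` without `override` triggers the
    // warning; adding the keyword fixes it.
    override bool opEquals(Object o)
    {
        auto other = cast(ArrayListLike) o;
        return other !is null && data == other.data;
    }
}

void main()
{
    auto a = new ArrayListLike;
    auto b = new ArrayListLike;
    a.data = [1, 2];
    b.data = [1, 2];
    assert(a == b);   // dispatches through opEquals
    b.data = [3];
    assert(a != b);
}
```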
May 19 2010
parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 18:11, Bernard Helyer wrote:
 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody! n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

And lines 772 and 780 complained about not being able to implicitly cast const(Thing) to Thing. Which is strange, because T was Thing. Inserting cast(Thing) seemed to 'fix' it. D=
May 19 2010
parent Bernard Helyer <b.helyer gmail.com> writes:
On 20/05/10 18:16, Bernard Helyer wrote:
 On 20/05/10 18:11, Bernard Helyer wrote:
 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody! n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

And lines 772 and 780 complained about not being able to implicitly cast const(Thing) to Thing. Which is strange, because T was Thing. Inserting cast(Thing) seemed to 'fix' it. D=

Sorry about the blow by blow, but the cursor thing seems to work well in my situation. Me likey dcollections very much so far. Go Steve!
May 19 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 Andrei Alexandrescu Wrote:


 To get back to one of my earlier points, the fact that the container
 interfaces are unable to express iteration is a corollary of the
 design's problem and a climactic disharmony.

 My vision, in very brief, is to foster a federation of independent
 containers abiding to identical names for similar functionality. Then a
 few concept checks (a la std.range checks) can easily express what
 capabilities a given client function needs from a container.

This might have a simple answer. Dcollections implementations are not a hierarchy, just the interfaces are. I.e. there aren't many kinds of HashMaps that derive from each other. But the interfaces are not detrimental to your ideas. The only thing interfaces require is that the entities implementing them are classes and not structs. As long as you agree that classes are the right call, then interfaces can co-exist with your other suggestions without interference.

Yes, if you want to define "this function needs something that is both addable and purgeable," I don't have an interface for that. But a concept can certainly define that generically (which is what you want anyway), or you could just say "I need a List" and get those functions too. It also does not force entities other than dcollections objects to be classes; they could be structs and implement the correct concepts.

I myself don't really use the interface aspect of the classes; it is mostly a carryover from the Java/Tango inspirations. But I can see one good reason to keep them -- binary interoperability. For example, it might be the case some day, when D has good support for dynamic libraries, that a library exposes some piece of itself as a Map or List interface.

So my answer is -- go ahead and define these concepts and required names, and you can ignore the interfaces if they don't interest you. They do not subtract from the possibilities, and others may find good use for them.

Does that make sense?

-Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.
May 19 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

This sounds like a failure of design. Why aren't you using ranges to do this?
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in the  
 interfaces.  If a library returns an interface, the author is saying "I  
 don't want you using any functions outside this interface," so why is  
 that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?
 Forcing people to *not* use interfaces has its drawbacks too.   
 Dcollections gives the most flexible design I could muster, while still  
 being useful.

 I'm not saying I'm against removing the interfaces until some later  
 date, but I don't see any convincing arguments yet, especially since  
 I've already seen benefits from having them.

 -Steve

May 20 2010
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 interfaces

Does that imply that the most important methods are virtual? If so, say good-bye to inlining, and hello to an additional level of dereferencing.

Without meaning any disrespect to all the work you did, allow me to say that I won't use a library that could be faster (without making usage too clumsy), but isn't. (I prefer my D programs to be as fast as reasonably possible -- if I didn't care about speed, I'd use another language.) For the same reasons, I'd be disappointed if the library were admitted as-is into Phobos, since it doesn't live up to my personal ideal of what D should be.

--
Best regards,
 Vladimir                            mailto:vladimir thecybershadow.net
May 21 2010
parent BCS <none anon.com> writes:
Hello Vladimir,

 On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:
 
 interfaces
 

If so, say good-bye to inlining, and hello to an additional level of dereferencing.

From a technical standpoint there is no reason that a method needs to be called virtually from a class reference just because the same method gets called virtually from an interface reference. -- ... <IXOYE><
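The point can be shown in a small D sketch (names are illustrative): a final class can implement an interface, and calls through the class reference need not be virtual -- only calls through the interface reference go through the vtable.

```d
interface Lengthy
{
    size_t length();
}

final class Arrayish : Lengthy
{
    int[] data;
    size_t length() { return data.length; }
}

void main()
{
    auto a = new Arrayish;
    a.data = [1, 2, 3];
    assert(a.length == 3);  // static type is final: direct call, inlinable
    Lengthy l = a;
    assert(l.length == 3);  // interface reference: virtual dispatch
}
```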
May 21 2010
prev sibling next sibling parent BLS <windevguy hotmail.de> writes:
Fantastic work Steve,
Pretty good design IMHO, not sure why .. but somehow the collection
battle reminds me of

Steve Vai vs Ry Cooder
<vbg>
Bjoern
May 23 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 22 May 2010 16:58:12 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 03:07 PM, Steven Schveighoffer wrote:
 Ellery Newcomer Wrote:

 Are the collections supposed to not have isEmpty members?

No. Use length == 0. O(1) length is always supported for all collections.

One thing before I forget: I think any good collection abstraction must be concretized back to the classic collection instances. Singly-linked lists definitely can't be left off the list! It would be an epic failure. Imagine the papers! New York Times: "D has containers, but no singly-linked lists". The New Yorker: "D's abstractions are too abstract". The National Enquirer: "The Bat Boy stole D's singly-linked lists". Pyongyang Times: "Another failure of the so-called Western Democracy -- yet Juche doesn't need singly-linked lists anyway."

Keeping the length cached in a singly-linked list is a costly mistake. It works against splicing (an important list primitive), and most of the time you don't need it, so why waste time updating it? Adding any cruft beyond { T payload; List* next; } is a very strong incentive for the coder to write their own. Why do you think they're using an SLL in the first place? Because it's simple and has efficient primitives!

length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't possible, but all dcollections define length. In that case, you can do coll.begin.empty to determine if the collection has any elements.

But all dcollections are bi-directional anyway; there is no singly linked list. That was a decision I made early on, because it allows more assumptions about dcollections' containers. I think length-less singly linked lists would be a good addition to phobos outside the collection hierarchy; they are almost as simple as built-in arrays.

And singly linked vs. doubly linked does not make any difference to whether O(1) length is possible. As you say, it's O(1) splicing or O(1) length, regardless of single or double links.

I disagree with your assessment that length is a less used operation than splicing. I think I have never used splicing with std::list. C++'s default is O(1) length, and I followed that design.
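The splice-versus-length trade-off can be made concrete with a runnable sketch of the { T payload; List* next; } structure under discussion (spliceAfter is an illustrative name): splicing a chain in is O(1) precisely because no length counter has to be updated.

```d
struct Node(T)
{
    T payload;
    Node!T* next;
}

// Splice the chain first..last in after pos -- O(1), no length bookkeeping.
// Caching a length would force this to walk the spliced run to count it.
void spliceAfter(T)(Node!T* pos, Node!T* first, Node!T* last)
{
    last.next = pos.next;
    pos.next = first;
}

void main()
{
    auto n3 = Node!int(3, null);
    auto n1 = Node!int(1, &n3);   // list: 1 -> 3
    auto n2 = Node!int(2, null);
    spliceAfter(&n1, &n2, &n2);   // now: 1 -> 2 -> 3
    assert(n1.next.payload == 2);
    assert(n1.next.next.payload == 3);
}
```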
 OTish: What are your thoughts on a bimap implementation and a
 child/sibling or general tree implementation as part of
 dcollections?

I haven't the slightest idea what a bimap is :) I am not an expert in collections or data structures; I just reimplement things I have understood. My implementations are basically copied from my algorithm book, and refined as much as I can. That being said, if you have an implementation of a tree or hash, it should be easy to insert into dcollections. If you have ideas for other collection types (i.e. other than Map, Set, Multiset or List), then I can look into that if you point me at an implementation or have one of your own. I purposefully left out multi-map because I've never had a huge use for it, and it seemed like an awkward thing to create an interface for...

Tries of various kinds come up again and again. I don't think dcollections' current abstractions capture them, which further supports my point that containers have too much personality to be caught in the straitjacket of class hierarchies.

I am not familiar with tries, and dcollections has no class hierarchy. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 Robert Jacques Wrote:

 On Wed, 19 May 2010 21:42:35 -0400, Steven Schveighoffer
 Does that make sense?

 -Steve

Yes and no. I understand where you're coming from, but I think it's a bad idea. First, I think it needlessly expands the radius of comprehension needed to understand and use the library. (See Tangled up in tools http://www.pragprog.com/magazines/2010-04/tangled-up-in-tools) Second, I think designing a library to be flexible enough to meet some future, anticipated need (e.g. dlls) is a good idea, but actually implementing vaporous future needs is fraught with peril; it's too easy to guess wrong. Third, interface-based design is viral: if library X uses interfaces, then I have to use interfaces to interface with it. And if another library Y uses classes, then I'm going to have to write a (needless) wrapper around one of them.

I understand these points, but I'm already using interfaces to copy data between containers. I don't have to, I could have used generic code, but this way, only one function is instantiated to copy data from all the other containers. The problem with using generic code is that the compiler will needlessly duplicate functions that are identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows much more possibilities in traversal than ranges which cannot use stack recursion without heap activity.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in the  
 interfaces.  If a library returns an interface, the author is saying "I  
 don't want you using any functions outside this interface," so why is  
 that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 14:00:42 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/20/2010 05:34 AM, Steven Schveighoffer wrote:

 I understand these points, but I'm already using interfaces to copy
 data between containers.  I don't have to, I could have used generic
 code, but this way, only one function is instantiated to copy data
 from all the other containers.  The problem with using generic code
 is that the compiler will needlessly duplicate functions that are
 identical.

There is a copy() function that copies any range to any other range. It might need a revisit, but I think the way to go about copying is generic.

This implies the space for the elements already exists in the target range. Copying data from a list to a set for instance can't just allocate space in the set and then use some generic copy function to copy the data in. If generic copying is to be used, it would be a copy function defined by the collection, not a standard one.
 Using interfaces is not as viral as you think.  My interfaces can be
 used in generic code, as long as the generic code uses functions in
 the interfaces.  If a library returns an interface, the author is
 saying "I don't want you using any functions outside this interface,"
 so why is that a bad thing?

 Forcing people to *not* use interfaces has its drawbacks too.
 Dcollections gives the most flexible design I could muster, while
 still being useful.

 I'm not saying I'm against removing the interfaces until some later
 date, but I don't see any convincing arguments yet, especially since
 I've already seen benefits from having them.

What are the benefits that you have noticed? I think you'd need to back up the copying argument with some data if you want to frame it as a benefit.

When I say I see benefits, I mean the benefits are not data-related but design-related -- more designs are possible with interfaces and generic programming combined than with generic programming alone. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 22 May 2010 07:56:31 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2010-05-21 22:55:16 -0400, Walter Bright <newshound1 digitalmars.com>  
 said:

 Walter Bright wrote:
 If we can get anywhere close to that level of success with ranges and  
 containers, we should all be well pleased.

matching", defined as the work necessary to get one library module to work with another library module.

This makes me think about something. In principle, I like the idea of containers being reference types. It works well when passing a container to functions. But at the same time, I despise it. By-reference containers force you to have extra indirections even when you don't need them, and you have to worry about null. Sometimes a value type would be better, when creating more complex data structures for instance:

    class Channel {
        private {
            Array!Message inbound;
            Array!Message outbound;
        }
        ...
    }

What's the point of having the extra indirection here?

Scope class members should solve this. It's been thrown around forever -- essentially, you should be able to declare that a class member's storage is contained within the owner object. Value-type node-based containers just don't work well -- it's too easy to escape references, and too easy to accidentally duplicate the entire node set when passing to functions. It works fine for arrays because an array has a very simple structure -- data and length -- and the length and data are somewhat orthogonal. -Steve
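The accidental-duplication hazard can be shown with a small sketch (ValueList is a hypothetical example, not a dcollections type): give a value-type container a copying postblit, and every pass-by-value silently duplicates the whole payload.

```d
struct ValueList
{
    int[] data;                       // stand-in for a chain of nodes
    this(this) { data = data.dup; }   // deep copy on every copy
}

void mutate(ValueList l)              // by value: operates on a full copy
{
    l.data[0] = 99;
}

void main()
{
    ValueList v;
    v.data = [1, 2, 3];
    mutate(v);                        // the entire "node set" is duplicated here
    assert(v.data == [1, 2, 3]);      // caller never sees the change
}
```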
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many
 kinds of HashMaps that derive from each other.  But the interfaces
 are not detrimental to your ideas.  The only thing interfaces require
 is that the entities implementing them are classes and not structs.
 As long as you agree that classes are the right call, then interfaces
 can co-exist with your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting; I don't think that is a good design.
 Yes, if you want to define "this function needs something that is
 both addable and purgeable, I don't have an interface for that.  But
 a concept can certainly define that generically (which is what you
 want anyways), or you could just say "I need a List" and get those
 functions also.  It also does not force entities other than
 dcollections objects to be classes, they could be structs and
 implement the correct concepts.

 I myself don't really use the interface aspect of the classes, it is
 mostly a carryover from the Java/Tango inspirations.

I don't know Tango, but Java's containers are a terrible example to follow. Java's container library is an ill-advised design on top of an underpowered language, patched later with a half-understood semblance of genericity. I think Java containers are a huge disservice to the programming community because they foster bad design.

Java has some warts as you have rightfully pointed out in the past (i.e. O(n) lookup), but I have attempted to remove all those warts. Dcollections I would hope does not suffer from the problems that Java's collections suffer from.
 But I can see
 one good reason to keep them -- binary interoperability.  For
 example, it might be the case some day when D has good support with
 dynamic libraries that a library exposes some piece of itself as a
 Map or List interface.

I need to disagree with that. I've done and I do a ton of binary interoperability stuff. You never expose a generic container interface! Interoperable objects always embody high-level logic that is specific to the application. They might use containers inside, but they invariably expose high-level, application-specific functionality.

It's done all the time in Java and .NET. For example, a GUI listbox widget exposes its elements as an array of elements which implements the List interface. You don't ever see the implementation or need it. Granted, Java and .NET have fewer problems than C++ and D with binary compatibility, since their function tables are dynamic, but the potential is there for D to make binary compatibility possible with interfaces.

In the past I built a C++ library that abstracted features of the OS. My goal was to make it possible to dynamically load a module that abstracted things like setting the IP address of a network interface. My modules used std::string instead of char * to look up services to get objects that implement the interface. Big mistake. In a later version of the standard C++ runtime, the private implementation of std::string changed, so the dynamically loaded libraries crashed horribly. No change in string's interface, just the private stuff changed, but because it's a template, the code that uses it necessarily has to be aware of it. We ended up ditching the standard C++ library's version of string and used STLport so we could control the library. I envision the same sort of problem would be likely with D collection objects that were not used via interfaces.
 So my answer is -- go ahead and define these concepts and required
 names, and you can ignore the interfaces if they don't interest you.
 They do not subtract from the possibilities, and others may find good
 use for them.

 Does that make sense?

I understand I could ignore the interfaces and call it a day, but it seems that at this point we are both convinced they are not quite good at anything: you only put them in because you suffered the Stockholm syndrome with Java, and I hate them with a passion.

The reason I put them in is because they existed before. But thinking about what they would be useful for: D doesn't really support dynamic libraries all that well (recent advances have brought that closer). Once it does, I think it will be much more important to keep compiled libraries compatible between versions of the standard library. This is typically not possible with templates; I know C++ is horrible at it from personal experience.

So I'm not convinced it is not quite good at anything. I think it's a solution to a problem that doesn't exist yet in D because D's linking features are stuck in 1995.
 Why would we keep in the standard library bad design with the advice  
 that "if you don't like it ignore it"?

People have continuously used that argument against const and immutable. Really that argument is fine, as long as you don't consider it bad design, which I don't :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.
 but all dcollections define length.

I was hoping we're on the verge of agreeing to yank all interfaces and let the collections roam free. If that's a possibility, then your design isn't constrained to define length anymore unless it can.

I agree, removing all the interfaces would get rid of the need for NO_LENGTH_SUPPORT. But all dcollections define a valid length, so that point is kinda moot :) However, supporting non-length containers via generic programming vs. interfaces would be much smoother.
 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list.  That was a decision I made early on, because it allows  
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.
 And singly linked vs. doubly linked does not make any difference whether
 O(1) length is possible or not. As you say, it's O(1) splicing or O(1)
 length, regardless of single or double links.

 I disagree with your assessment that length is a less used operation
 than splicing. I think I have never used splicing with std::list. C++'s
 default is O(1) length, and I followed that design.

I didn't assess that.

I felt like you did. Your statement was: "It works against splicing (an important list primitive) and most of the time you don't need it so why waste time updating it." While you didn't specifically say it was used more, you mentioned the importance of splicing, and the non-importance of length. I guess I should have said, I disagree that splicing is more important than caching length.
 My main point was that if one defines an slist, in many cases one  
 defines it to be very small, simple, and cheap. Maintaining size will  
 come on the radar and would discourage the use of the abstraction in  
 favor of a hand-rolled implementation. This is failure to abstract.

All dcollections own their elements; it is not like an array, where there can be multiple references to subsets of the data. An slist would simply be an slist as you describe, with a slightly bigger head. In other words, the head node has the length cache, allocator, object monitor, and whatever else you need for the whole list. The nodes themselves would be a simple element and pointer. But the elements are never exposed except via ranges and cursors. The ranges and cursors cannot affect the topology of the list; you need the head "owner" node to do that.

I look at it this way: dcollections are "good enough" for most uses; if you need highly tailored data structures with specific uses in mind, you can make those on your own. For example, dcollections' LinkList can easily take the place of a simple slist. It may not be as fast with your specific requirements in mind, but it works. The benefit is, it works like all the other collection types, and it's easy to learn them all together because they share certain properties.
 I don't know of a word for "hierarchy with interfaces as root and inner  
 nodes and classes as leaf nodes" so I just call that class hierarchy.

I view them as classes with interfaces tacked on because the implementations are similar :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon first read. I suggest you look at tries, and at the following other structures, as good examples of why it's futile to fit collections into hierarchies: http://en.wikipedia.org/wiki/Trie http://linux.thai.net/~thep/datrie/datrie.html http://en.wikipedia.org/wiki/Suffix_tree http://en.wikipedia.org/wiki/Kd-tree In time we'd want to implement those and many more in Phobos, without worrying that some of their primitives won't fit the existing interfaces, and without the unjustified effort of formalizing interfaces for each of them on the assumption that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

    class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys. -Steve
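To make the K[]-key idea concrete, a minimal trie over string keys might look like this (an illustration only, not a proposed dcollections type):

```d
struct TrieNode(V)
{
    TrieNode!V*[char] children;  // one edge per character
    V* value;                    // non-null only at stored keys
}

// Walk/extend the path for key, then store val at its end.
void insert(V)(ref TrieNode!V root, string key, V val)
{
    auto cur = &root;
    foreach (c; key)
    {
        if (c !in cur.children)
            cur.children[c] = new TrieNode!V;
        cur = cur.children[c];
    }
    cur.value = new V;
    *cur.value = val;
}

// Return a pointer to the stored value, or null if key is absent.
V* lookup(V)(ref TrieNode!V root, string key)
{
    auto cur = &root;
    foreach (c; key)
    {
        auto p = c in cur.children;
        if (p is null)
            return null;
        cur = *p;
    }
    return cur.value;
}

void main()
{
    TrieNode!int root;
    insert(root, "cat", 1);
    insert(root, "car", 2);
    assert(*lookup(root, "cat") == 1);
    assert(*lookup(root, "car") == 2);
    assert(lookup(root, "ca") is null);  // a prefix, not a stored key
}
```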
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 21 May 2010 22:56:54 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Thu, 20 May 2010 04:42:35 +0300, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 interfaces

Does that imply that the most important methods are virtual? If so, say good-bye to inlining, and hello to an additional level of dereferencing. Without meaning any disrespect to all the work you did, allow me to say that I won't use a library that could be faster (without making usage too clumsy), but isn't. (I prefer my D programs to be as fast as reasonably possible - if I didn't care about speed, I'd use another language.) For the same reasons, I'd be disappointed if the library was admitted as-is into Phobos, since it doesn't live up to my personal ideal of what D should be.

With the future update making all the classes final, their methods are not virtual as long as you don't use the interfaces. -Steve
May 24 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Mon, 24 May 2010 08:06:29 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:
 On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:


 I understand these points, but I'm already using interfaces to copy  
 data between containers.  I don't have to, I could have used generic  
 code, but this way, only one function is instantiated to copy data  
 from all the other containers.  The problem with using generic code is  
 that the compiler will needlessly duplicate functions that are  
 identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows many more possibilities in traversal than ranges, which cannot use stack recursion without heap activity.

Ranges are not necessarily better and may have some minor amount of overhead over a well optimized opApply. Then again, the opposite may be true. The point is that if the difference between opApply and a range is more than trivial, you've probably had a failure of design occur. Honestly, I'm having trouble thinking of a container which allows stack recursion for traversal and doesn't have an efficient range/loop variant. For that matter, a range using heap activity is also a clear indication of a design failure.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in  
 the interfaces.  If a library returns an interface, the author is  
 saying "I don't want you using any functions outside this interface,"  
 so why is that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve

No, if a 3rd party library _needs_ to use interfaces it's probably for a good reason. The problem is, if they exist, people are going to use them even if they don't need them. And therein lies the problem.
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this: auto total = cont1.length + cont2.length; because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check. I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length, it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
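The awkwardness is easy to see in a sketch (Python; the sentinel value and the `checked_length` helper are hypothetical stand-ins for dcollections' `NO_LENGTH_SUPPORT`, which in D would be something like `size_t.max`):

```python
# Hypothetical sentinel standing in for dcollections' NO_LENGTH_SUPPORT
NO_LENGTH_SUPPORT = 2**64 - 1

def checked_length(n):
    # the runtime check every client of the interface would need
    if n == NO_LENGTH_SUPPORT:
        raise ValueError("container does not support O(1) length")
    return n

len1, len2 = 10, NO_LENGTH_SUPPORT

# Andrei's example: this "compiles, runs, and produces some odd result"
bogus_total = len1 + len2
assert bogus_total == 10 + NO_LENGTH_SUPPORT  # nonsense value, no error raised

# The guarded version surfaces the problem instead of hiding it
try:
    total = checked_length(len1) + checked_length(len2)
except ValueError:
    total = None
assert total is None
```

Every arithmetic use of `length` silently inherits the sentinel unless the caller remembers the guard, which is the door-left-ajar problem Andrei describes.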
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows  
 more
 assumptions about dcollections' containers. I think length-less singly
 linked lists would be a good addition to phobos that are not part of  
 the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist  
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back(). -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:51:39 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:39 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:21:20 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 I am not familiar with tries,

Missed that upon the first read. I suggest you look at tries and the following other structures as good examples that it's futile to fit collections into hierarchies:

http://en.wikipedia.org/wiki/Trie
http://linux.thai.net/~thep/datrie/datrie.html
http://en.wikipedia.org/wiki/Suffix_tree
http://en.wikipedia.org/wiki/Kd-tree

We'd want to implement those and many more in Phobos in time, without worrying that some of their primitives won't fit the existing interfaces, and also without the unjustified effort of formalizing interfaces for each of them in the belief that another very, very similar container will come along.

From a cursory look, I don't see why tries would not be possible in dcollections. I'd probably start with something like this:

class TrieMap(K, V) : Map!(K[], V)

The array brackets enforce the ability to use prefixes on the keys.

The key point of tries is that they have distributed storage. Thus, shoehorning them into the interface function Iterator!(K[]) keys(); forces allocation and copying.

I wasn't thinking of that; I was thinking the key point of tries is more efficient lookup for strings of things. One solution is that Trie could be a separate interface (i.e. a sibling of Map). One thing to point out is that D's arrays are adept at appending and prefixing. A K[] key would not necessarily be reallocating for each node traversal, but it certainly would be copying data. How would you define a way to get all the keys of a Trie? Or would you simply leave that function off? Is it unreasonable to expect to be able to iterate the keys in a trie? (I don't really know, I've never worked with them) -Steve
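A toy trie makes the tradeoff concrete: a `keys` iterator must rebuild each key from the node path, which is exactly the copying Andrei objects to (Python sketch; all names here are illustrative, not dcollections' API):

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # key element -> child node
        self.value = None
        self.terminal = False   # does a key end at this node?

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
        node.terminal = True

    def keys(self):
        # Each yielded key is (re)built from the path of nodes: the trie
        # never stores whole keys, so this is where the copying happens.
        def walk(node, prefix):
            if node.terminal:
                yield "".join(prefix)
            for ch, child in sorted(node.children.items()):
                prefix.append(ch)
                yield from walk(child, prefix)
                prefix.pop()
        yield from walk(self.root, [])

t = Trie()
for k in ("ax", "ay", "b"):
    t.insert(k, len(k))
assert list(t.keys()) == ["ax", "ay", "b"]
```

The shared `prefix` buffer keeps the traversal cheap, but each key handed to the caller is still a fresh copy assembled from the nodes.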
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 11:45:51 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:01 AM, Steven Schveighoffer wrote:
 On Fri, 21 May 2010 13:42:14 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/19/2010 08:42 PM, Steven Schveighoffer wrote:
 Andrei Alexandrescu Wrote:


 I.e. there aren't many kinds of HashMaps that derive from each
 other. But the interfaces are not detrimental to your ideas. The
 only thing interfaces require is that the entities implementing
 them are classes and not structs. As long as you agree that
 classes are the right call, then interfaces can co-exist with
 your other suggestions without interference.

This brings back a discussion I had with Walter a while ago, with echoes in the newsgroup. Basically the conclusion was as follows: if a container never escapes the addresses of its elements, it can manage its own storage. That forces, however, the container to be a struct because copying references to a class container would break that encapsulation. I called those "perfectly encapsulated containers" and I think they are good candidates for manual memory management because they tend to deal in relatively large chunks. I noticed that your collections return things by value, so they are good candidates for perfect encapsulation.

Then you need reference counting, I don't think that is a good design.

Why? I think refcounting for containers is perfect. Refcounting small objects is bad because the counter overhead is large. Also, small objects tend to be created and passed around a lot, so the time overhead is significant. In contrast, refcounting and containers are a perfect marriage. The container is a relatively large conglomerate of objects so refcounting that is bound to yield good benefits in terms of memory reclamation.

I meant refcounting elements. If you are simply refcounting the container, what do you do when someone wants to remove a node? Not allow it if more than one reference to the container exists?
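Python containers are already reference-counted handles, so the scenario Steve asks about can be sketched directly (assignment copies only the handle, never the elements):

```python
# A list variable is a refcounted handle to a shared container --
# the model Andrei proposes for D container structs.
a = [1, 2, 3]
b = a                 # copies the handle (one refcount bump), not the elements
b.remove(2)           # remove a "node" through one reference...
assert a == [1, 3]    # ...and every other reference sees the removal
assert a is b         # both names refer to the same container
```

Refcounting the container controls when the storage is freed; it does nothing to prevent removal through an alias, which is the distinction Steve is probing.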
 Java has some warts as you have rightfully pointed out in the past
 (i.e. O(n) lookup), but I have attempted to remove all those warts.
 Dcollections I would hope does not suffer from the problems that
 Java's collections suffer from.

That's great. But let me quote what you said: "I myself don't really use the interface aspect of the classes, it is mostly a carryover from the Java/Tango inspirations." I took that to mean you don't have a strong justification for structuring dcollections as a hierarchy, and furthermore that makes me hope it's possible you'd be willing to yank that dinosaur brain.

I did not have a strong justification for using interfaces originally, and my first interface design shows that. But I think with careful design, the interfaces in the D2 version are pretty useful in some situations, all of which could be essential to a project's design. In other words, I don't see the fact that Java or Tango's original collection package had interfaces as a "wart". The problem I see right now is, a lot of that utility is moot since D is not very good at dynamic linking. Since you necessarily have to statically link Phobos and dcollections, it makes no sense to keep binary compatibility, or hide implementation.
 Looks like we're heading straight to stalemate once again.

 In the interest of time, it would be better to get to stalemate (or,  
 hopefully, agreement) so we know whether dcollections will be integrated  
 within Phobos or whether I should spend this and next weeks' evenings  
 coding. To that end, please let me know whether it's worth that I spend  
 time on putting together a list of proposed changes.

I think if we can keep dcollections as classes, then the opportunity to have interfaces still exists for the future. If that's the case, we can ditch the interfaces for now, and revisit it when D's dynamic lib support gets better. So let's move on to other issues. I haven't changed my mind on interface utility, but there seems to be not much support for the idea. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 12:06:18 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Mon, 24 May 2010 08:06:29 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:
 On Thu, 20 May 2010 12:46:59 -0400, Robert Jacques <sandford jhu.edu>  
 wrote:

 On Thu, 20 May 2010 06:34:42 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:


 I understand these points, but I'm already using interfaces to copy  
 data between containers.  I don't have to, I could have used generic  
 code, but this way, only one function is instantiated to copy data  
 from all the other containers.  The problem with using generic code  
 is that the compiler will needlessly duplicate functions that are  
 identical.

This sounds like a failure of design. Why aren't you using ranges to do this?

Why are ranges necessarily better? I'm using the container's opApply, which I'd probably still use even if there were no interfaces. opApply allows many more possibilities in traversal than ranges, which cannot use stack recursion without heap activity.

Ranges are not necessarily better and may have some minor amount of overhead over a well optimized opApply. Then again, the opposite may be true. The point is that if the difference between opApply and a range is more than trivial, you've probably had a failure of design occur.

The difference is trivial. I support both ranges and opApply. The main benefit of using opApply is that only one function is compiled by the compiler.
 Using interfaces is not as viral as you think.  My interfaces can be  
 used in generic code, as long as the generic code uses functions in  
 the interfaces.  If a library returns an interface, the author is  
 saying "I don't want you using any functions outside this interface,"  
 so why is that a bad thing?

Well, needlessly duplicated functions for one. :) More importantly, the example I gave was about third party libraries which I have no control over. So this solution explicitly doesn't work. And even if everyone used templates everywhere in order to be compatible with both interfaces and classes, isn't that a viral effect of having both?

If a 3rd party library uses interfaces, it's probably for good reason. They most likely want to remain binary compatible with other libs, and/or want to abstract the implementation of some custom container type. If you don't like their requirements, don't use the library. -Steve

No, if a 3rd party library _needs_ to use interfaces it's probably for a good reason. The problem is, if they exist, people are going to use them even if they don't need them. And therein lies the problem.

And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_ to use interfaces, and your code depends on X and Y, and interfaces aren't defined by dcollections, where are you then? I don't think there's any way you can define a generic standard library such that people won't create bad designs, that's a lost cause. I think part of the problem with all this is that interfaces aren't likely to be needed any time soon (D's dynamic lib support is almost non-existent), so I'm probably going to drop the interfaces for now in the interest of progress. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:04:31 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Scope class members should solve this.  It's been thrown around forever  
 --
 essentially, you should be able to define that a class member's storage  
 is
 contained within the owner object.

I too have proposed this idea, but in my opinion you can't design a big part of a standard lib on the basis of a syntax/language optimization that doesn't exist yet.

Yeah, but essentially, it's an optimization. Whether the storage is stored in the same heap block or a different one really doesn't matter in terms of functionality. Allocating one heap block vs. two is faster, that's what the OP was saying. In other words, scope class members aren't essential to using dcollections. -Steve
May 24 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:

 And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_ to  
 use interfaces, and your code depends on X and Y, and interfaces aren't  
 defined by dcollections, where are you then?

You're at the point where the language allows you to create a class following an interface, all of whose methods redirect to those of the struct. This could even be done automagically. -- Simen
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:11:27 -0400, Simen kjaeraas  
<simen.kjaras gmail.com> wrote:

 Steven Schveighoffer <schveiguy yahoo.com> wrote:

 And if 3rd party X _needs_ to use interfaces, and 3rd party Y _needs_  
 to use interfaces, and your code depends on X and Y, and interfaces  
 aren't defined by dcollections, where are you then?

You're at the point where the language allows you to create a class following an interface, all of whose methods redirect to those of the struct. This could even be done automagically.

A situation Robert has stated he would like to avoid. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie?  (I don't really know, I've never worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind).

That could be supported. But iterating over the keys, is that not advisable? Would your trie iterator be the only way to iterate over the elements? It seems rather non-useful. Or is there another way of iterating the keys that would be more efficient than building an array to contain the 'key' for each node? -Steve
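The exploratory iteration Andrei describes can be sketched as a cursor over a toy trie (Python; `descend` plays the role of his `popFront(char)`, and `count` is one possible piece of `front()` information; all names here are invented):

```python
def build_trie(words):
    # plain nested dicts; a terminal key is marked with the entry None
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[None] = True
    return root

def _terminals(node):
    # yield once per key ending at or below this node
    for k, v in node.items():
        if k is None:
            yield True
        else:
            yield from _terminals(v)

class Explorer:
    # a range-like cursor: each descend(ch) narrows to keys sharing the prefix
    def __init__(self, node):
        self.node = node

    def descend(self, ch):              # Andrei's popFront(char)
        child = self.node.get(ch) if self.node is not None else None
        return Explorer(child)

    @property
    def empty(self):                    # the range went "dry": no such prefix
        return self.node is None

    def count(self):                    # front(): info about keys under prefix
        if self.node is None:
            return 0
        return sum(1 for _ in _terminals(self.node))

t = build_trie(["ax", "axe", "ay", "b"])
r = Explorer(t).descend("a")
assert not r.empty and r.count() == 3   # ax, axe, ay
r = r.descend("x")
assert r.count() == 2                   # ax, axe
assert r.descend("q").empty             # nothing starts with "axq"
```

Note that nothing in this exploration ever materializes a whole key, which is why it sidesteps the copying that a `keys` iterator needs.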
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 20 May 2010 02:11:58 -0400, Bernard Helyer <b.helyer gmail.com>  
wrote:

 On 20/05/10 13:39, Bernard Helyer wrote:
 Oooohhh goody goody goody!  n_n


 I'm in the process of making a little toy project ATM. I'll shall
 integrate dcollections 2.0a into ASAP.

ArrayList doesn't compile with warnings as it overrides opEquals, but doesn't mark it as such.

http://www.dsource.org/projects/dcollections/ticket/5 Any other problems, please create a ticket (including your Thing one, but I'm not sure what you were doing there :) -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 11:14 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 11:45:44 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 10:23 AM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 10:10:51 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 06:54 AM, Steven Schveighoffer wrote:
 length is allowed to return NO_LENGTH_SUPPORT if O(1) length isn't
 possible,

Do you agree that's an awkwardness?

Yes, at that point it's an optimization. The only place where I assume length could be invalid is when working with the Iterator!V type, which might not be a dcollections object.

That doesn't matter. Since you define NO_LENGTH_SUPPORT and you have interfaces, the door is ajar forever. Client code can't write this: auto total = cont1.length + cont2.length; because that is incorrect code (that compiles, runs, and produces some odd result). So don't take it lightly. NO_LENGTH_SUPPORT means no length support. Then pretending it's supported will only harm everyone.

I get what you are saying. What I meant was it's moot if you are not using interfaces. If you don't know what the underlying type is, then you have to do a runtime check. I agree it's awkward, and I could have not included length as a member of Iterator, in which case it would be on all the container types and NO_LENGTH_SUPPORT would not exist. All containers are meant to have a valid length, it is only with non-container Iterators that length can be NO_LENGTH_SUPPORT.
 However, supporting non-length containers via
 generic programming vs. interfaces would be much smoother.

 In that case, you can do
 coll.begin.empty to determine if the collection has any elements.

Then why not make length optional and define empty? From the above it looks like an obviously better design.
 But all dcollections are bi-directional anyways, there is no singly
 linked list. That was a decision I made early on, because it allows
 more
 assumptions about dcollections' containers. I think length-less  
 singly
 linked lists would be a good addition to phobos that are not part
 of the
 collection hierarchy, they are almost on par with builtin arrays as
 being so simple.

Do you see dcollections' inability to express slists as a problem?

I don't see them as unable to express slists. They would break the design guidelines that I set for collections being iterable in both directions, but an slist fits fine within the List interface.

What's the required complexity of concat?

O(n) with n being the number of elements added
 Is add expected to put things in a specific place? How does slist
 implement back()?

 slist does not fit the List interface.

A pointer to the end element would be required for both appending and back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :) The overhead of storage is minimal compared to the elements of the list, which do not contain said overhead.
 I think we need to build the shared vision that in Phobos we're  
 competing against hand-written, highly-optimized code. This is the fight  
 STL took and largely won. We can't afford to lower our standards for one  
 tiny bit.

 I was once talking over a beer about the advantages of container  
 abstractions. Walter slapped my face by defining the fastest SLL in the  
 world on a napkin. It looked like this:

 struct List { int v; List * next; }

 He took so little time to define that, and the primitives he needed were  
 so simple and fast, that it was an absolute overkill to replace that  
 with a generic whiz-bang container. If I'd mentioned there'd be any  
 extra data involved, that would mean the end of negotiation. "Why do you  
 need that?" "For length()!" "But I don't need length()!" "Well you have  
 to." "That's Communism in design!"

 And I agree.

But something like the above is prone to all sorts of nasty things, like inadvertent cycles in your list. That may be acceptable to you if you want the bleeding fastest speed available. There will always be tailored code crafted to be as fast as possible for some specific design, and dcollections and your slist just won't fit the bill. And dcollections' link list node is exactly what you wrote there (with a prev pointer of course). The only difference is, I don't define an element to be the same as a list. The whole list has its own data, including a pointer to the first element in the list. Look at D's arrays. Is appending with D's arrays the fastest it possibly could be? Hell no, but it's good enough for most situations, and safe. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:32:01 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 And the appending is hard to use too, see the ridiculously complex to  
 use & messy capacity, reserve, and assumeSafeAppend.  So overall it's  
 a good example of what I will never want to copy.

That part is still pretty new. What do you find is messy and complex about it? When I said "good enough for most situations" I didn't mean "most situations where appending speed is crucial to the success of the application" :P Most situations means you need to do appending once in a while. A perfect example is something like toStringz. And most situations do not require the three above mentioned functions. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:29:23 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Look at D's arrays.  Is appending with D's arrays the fastest it  
 possibly
 could be?  Hell no, but it's good enough for most situations, and safe.

Appending to D dynamic arrays is awful: it's slow and requires complex code.

complex code? a ~= b; Seems pretty not complex. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 15:58:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:12 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 13:36:01 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 12:11 PM, bearophile wrote:
 Steven Schveighoffer:
 Is it unreasonable to expect to be able
 to iterate the keys in a trie? (I don't really know, I've never  
 worked
 with them)

On tries you can iterate on all keys or on a sorted range of the keys.

There's one more way of iteration that's unique to tries - by key element (e.g. if key is string, you iterate by character). That would be an exploratory iteration. At each step in the iteration you pass a character, so popFront() takes an argument, it's popFront(char). trie.popFront('a') takes you to all elements in the trie that start with 'a'. Then you can continue exploring by e.g. trie.popFront('x') which takes you to all elements that start with "ax". At any point the range may become dry (nothing with this prefix). If the range is not empty, front() gives you information about all strings that share the particular prefix you have spanned (count and breadth come to mind).

That could be supported.

To paraphrase a commercial, "there's an interface for that" :o).

No there isn't. I wouldn't make that an interface, as it's not foreachable. I'd build a special range for it.
 But iterating over the keys, is that not advisable? Would your trie
 iterator be the only way to iterate over the elements? It seems rather
 non-useful.

I think a trie, like many other nontrivial containers, supports several means of iteration.

Including keys?!!! Still not a straight answer. If you should be able to iterate keys, then say you should be able to iterate keys. If you shouldn't, then say that.
 Or is there another way of iterating the keys that would be more  
 efficient than building an array to contain the 'key' for each node?

I don't know. I'm just glad I can explore that without having to think in interfaces.

I don't really know what this means. The interfaces are a way to plug containers in at runtime. They are not meant to be the entire API for the collection. You can absolutely add whatever functionality that does not fit in the interface as an additional feature, and users will have to use the exact type in order to use that feature. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:35:03 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Andrei Alexandrescu:
 When was the last time you measured? I thought the speed has largely
 improved since Steve integrated his work.

I have timed it after that integration. I have seen a performance improvement, but it's small. I can perform some syntactic benchmarks.

Performance was vastly improved for situations where one was appending to multiple arrays at once. For appending to a single array in a loop, it is improved, but not that much. But the main improvement was for safety. The append operation on D's dynamic arrays is a tradeoff between footprint, speed, and safety. It also does not and should not compromise performance on operations besides appending. You will always be able to outperform the D append operation with more focus on the append operation, but D's array appending is fine for most situations. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:07:22 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 01:49 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 12:27:10 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 A pointer to the end element would be required for both appending and
 back().

This further erodes my confidence. Size needs to be maintained _and_ a pointer to the last element must be maintained. For many uses, all of this stuff is unnecessary overhead.

You make it sound like incrementing a counter and changing a pointer are incredibly slow operations :)

They are. Ask Walter for a good example.

Actually, I realize that pointing to the end node isn't good enough, because once you remove that node, there is no way to get to the previous node without traversing the list again. So slist cannot implement the List interface, you were right.
 The overhead of storage is minimal
 compared to the elements of the list, which do not contain said  
 overhead.

I'm talking containers of lists here.

I was more talking about the ratio of elements to lists. For example, the Hash implementation uses the same link list nodes as elements in its table, and generally a hash has very few elements per list, so overhead on the list head would be too much. On top of that, a LinkList has its own allocator, meaning each one consumes a large chunk of data assuming you will be adding lots of elements to it. So maybe we need to refine the implementation building blocks that back the dcollections classes so they are useable in special situations where performance is critical. It actually would be pretty trivial to define SLink as a specialization of dcollections' Link struct, and that would essentially give you a lot of functionality in terms of your ideal list type.
 Which may be acceptable to you if you
 want the bleeding fastest speed available. There will always be tailored
 code crafted to be as fast as possible for some specific design and
 dcollections and your slist just won't fit the bill.

Except that the set is considerably smaller for a well-designed slist.

The set of problems that an slist solves is also considerably smaller. Having that backwards capability enables more algorithms. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap? Forget about interfaces; pretend you were writing it separately from dcollections. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy trie traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?
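The parent-pointer variant can be sketched like this (hypothetical node layout; a real trie would also hold children and an optional value):

```d
// Sketch: recovering a key from a trie node when each node keeps a
// parent pointer, as suggested above. Hypothetical, simplified layout.
struct TrieNode
{
    char label;       // edge character leading into this node
    TrieNode* parent; // null at the root
}

// Walk leaf-to-root, prepending, so no per-key storage is needed
// until a caller actually asks for the key.
string keyOf(TrieNode* leaf)
{
    char[] buf;
    for (auto n = leaf; n !is null && n.parent !is null; n = n.parent)
        buf = n.label ~ buf; // prepend: builds root-to-leaf order
    return buf.idup;
}
```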

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know. -Steve
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 24 May 2010 17:47:18 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:38 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 17:35:11 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 05/24/2010 04:08 PM, Steven Schveighoffer wrote:
 On Mon, 24 May 2010 16:27:46 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Sorry. Yes, by-key iteration should be possible.

OK, so we should be able to iterate keys. And the keys are not stored in the trie structure itself. So how does one iterate the keys of the container without reconstructing them from the trie nodes using the heap?

You can't. At some point you need to copy trie traces into T[] arrays. If the trie stores parent nodes, you don't need to do that, as you can expose a trace as a double-ended range containing the key. There's a kind of trie that was defined to avoid such issues; it stores the keys in the clear, in the leaves, at the expense of duplication. I don't remember the name offhand. Does anyone?

OK, so the keys function of Map should work just fine for a Trie implementation. Good to know.

This wins a little battle but not the war. Long term you're facing the Sisyphean job of either knocking new containers into the existing interfaces, or extending the interfaces to accommodate new containers. It doesn't look to me like a winning proposition.

It's always easy to say that there may come a day when the interfaces don't work -- none of us can see the future. When that day comes, we can find a solution to it. The worst case scenario is that you simply don't implement any interfaces; that does not detract from the interfaces that exist. It would seem odd for some of the collections to implement the interfaces while others do not. At the very least, though, all containers should be iterable, meaning they will implement the Iterator!V interface. That at least allows them to interact with the other container types. On the flip side, if containers did not implement interfaces, having to do this:

class WrappedSet(alias Impl, V) : Set!V
{
    private Impl!V impl;
    int functionToSatisfySet() { return impl.functionToSatisfySet(); }
    ...
}

seems to me a lot more crufty and bloated than simply adding : Set!V to the end of the class declarations. But I'm willing to drop the interfaces for now, since interfaces are obviously an unpopular choice. -Steve
May 24 2010
prev sibling next sibling parent "Simen kjaeraas" <simen.kjaras gmail.com> writes:
Steven Schveighoffer <schveiguy yahoo.com> wrote:
 On the flip side, if containers did not implement interfaces, having to  
 do this:

 class WrappedSet(alias Impl, V) : Set!V
 {
     private Impl!V impl;
     int functionToSatisfySet() { return impl.functionToSatisfySet(); }
     ...
 }

 seems to me like a lot more crufty and bloated than simply adding :  
 Set!V on the end of the class declarations.

This would not be necessary. We can get the function names off of the interface, so we could have a template do this for us. -- Simen
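The idea Simen describes might look roughly like this. This is a deliberately simplified sketch: it assumes every interface method is nullary and returns int, whereas a real version would also reflect parameter lists and return types:

```d
// Sketch of generating forwarding methods from an interface's member
// names with a string mixin. Deliberately simplified: assumes every
// method takes no arguments and returns int.
interface Set(V)
{
    int count();
}

string forwardAll(I)()
{
    string code;
    foreach (name; __traits(allMembers, I))
        code ~= "int " ~ name ~ "() { return impl." ~ name ~ "(); }\n";
    return code;
}

class WrappedSet(alias Impl, V) : Set!V
{
    private Impl!V impl;
    mixin(forwardAll!(Set!V)()); // forwarding bodies generated here
}
```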
May 24 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib. And I have specifically decided not to use interfaces with ranges, because that makes them reference types. Ranges work well as value types, but not well as reference types. Therefore, to use dcollections through interfaces, you must not require the range traits. -Steve
May 27 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 19 May 2010 17:18:04 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 Andrej Mitrovic Wrote:


 Well as long as you're here can I submit an error here? :)

 I get an error while building the D2 branch:

 Error: ArrayMultiset.obj : No such file or directory

Crud, I admit that I assumed anything that built on Linux would build on Windows. I still believe it will, but of course, I need to change the batch file :)
 It seems the ArrayMultiset.d is not in the dcollections folder for the  
 D2.x  branch, although it is in trunk (I guess the trunk is the D1.x  
 one?).

Yes, D1 is trunk, D2 is going to eventually be trunk
 Is that module deprecated for d2.x? (although it's listed in the win32  
 batch file)

Yes, just remove it. I'll fix it when I get a chance.

I've fixed the windows build files and recreated the zip/tarball files for 2.0a. Learned some batch-fu online and hopefully the new versions will be future-proof. Please re-download the zipfile to build with windows. Thanks -Steve
May 27 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 28 May 2010 06:24:26 -0400, Bruno Medeiros  
<brunodomedeiros+spam com.gmail> wrote:

 On 27/05/2010 11:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

Ah, never mind, my mind slipped and I was thinking of any kind of library, static ones as well. Although even just dynamic-library compatibility seems a valid enough case that we should consider it from the start, even if it's not well supported currently.

I agree, that's why dcollections has interfaces. I'm keeping them there since std.container has gone its own direction. -Steve
May 28 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 28 May 2010 06:10:49 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-05-27 12:32, Steven Schveighoffer wrote:
 On Wed, 26 May 2010 10:06:32 -0400, Bruno Medeiros
 <brunodomedeiros+spam com.gmail> wrote:

 On 24/05/2010 16:45, Andrei Alexandrescu wrote:
 In the past I have built a C++ library that abstracted features of
 the OS. My goal was to make it possible to dynamically load a module
 that abstracted things like setting the IP address of a network
 interface. My modules used std::string instead of char * to lookup
 services to get objects that implement the interface. Big mistake. On
 a later version of the standard C++ runtime, the private
 implementation of std::string changed, so the dynamically loaded
 libraries crashed horribly. No change in string's interface, just the
 private stuff changed, but because it's a template, the code that
 uses it necessarily has to be aware of it. We ended up ditching the
 standard C++ library's version of string, and used STLPort so we
 could control the library.

 I envision this same sort of problem would be likely with D
 collection objects that were not used via interfaces.

I see no problem retrofitting a no-interface container into a formal interface if so needed.

I don't understand this discussion: isn't the reason above pretty much a dead-on hard requirement for the collections to have interfaces? Something like, for example, an interface version of the range traits?

Only if you wish to have binary compatibility with dynamic libs. Such a thing isn't likely today, since dynamic libs aren't very well supported in D, and neither phobos nor dcollections is a dynamic lib.

I got my patch for building Tango as a dynamic library on Mac included in trunk quite recently. I also have a patch for druntime and Phobos in bugzilla just waiting to be included, plus one patch that makes it easier to create dynamic libraries directly with DMD. I would say it's a bad idea to keep thinking that dynamic libraries aren't supported; we have to think forward and assume they will be.

I remember that, and I'm very encouraged by it. That being said, the ultimate goal is to have dmd able to build dynamic libraries easily. D has had dynamic library "support" for years, but you have to do all kinds of manual stuff, and the standard library isn't dynamic. Until the standard library is dynamic, and I can build a dynamic library with a -shared equivalent switch, dynamic libs are a laboratory feature, and not many projects will use them. Just so you know, I think it's important to support binary compatibility in dcollections, and since std.container has not adopted dcollections, I'm going to keep interfaces. I was just pointing out the position others may hold WRT binary support. -Steve
May 28 2010