digitalmars.D.announce - dcollections 0.01 release

Steven Schveighoffer (28/28) May 05 2008 I've been tinkering with a collection package that is a hybrid between C...

Some guy (2/38) May 05 2008 Just wondering why people (especially from the U.S.) always build red-bl...

Spacen Jasset (3/43) May 06 2008 It mentions in that very wiki article that AVL trees may be slower at
Steven Schveighoffer (6/10) May 06 2008 Maybe because I've never heard of them :) Although, I probably did in m...
Robert Fraser (10/50) May 07 2008 Well, according to my data structures prof...

torhu (2/5) May 06 2008 Interesting stuff. Any plans for adding sorting?

Steven Schveighoffer (17/22) May 06 2008 I wasn't planning on it...

BCS (7/41) May 06 2008 One thing that might make this easier would be a collection of collectio...

janderson (5/53) May 07 2008 If I understand you correctly, that could also be useful if you want to

torhu (5/23) May 06 2008 I was mostly thinking about a sortable linked list. I guess it can be

Charles D Hixson (4/40) May 06 2008 Nice! Have you considered adding a B+Tree (or a B*Tree).

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

I've been tinkering with a collection package that is a hybrid between C++, 
Java, and Tango, utilizing the best D features (such as slicing, foreach, 
etc.).

The result is dcollections.  Here is a list of the features:

    * Hash, RBTree, Link, and Array implementations for appropriate 
containers.
    * List, Set, Map, and Multiset containers provided.
    * Able to swap out underlying implementation of a container, or 
customize implementation.
    * Minimized heap activity. All cursors are struct-based.
    * Should be compatible with both Tango and Phobos (tested with Tango).
    * Slicing where appropriate (currently only ArrayList, but will add to 
other containers).
    * Removal while traversing.
    * Removal of elements does not invalidate cursors where possible.
    * Cursors can be kept for later use (such as O(1) removal if supported 
by the container).
    * Interfaces for implementation-independent code.
    * Concatenation and appending for lists.
    * dup functions.
    * Set/Map intersection.
    * Handy filter, transform, and chain iterators.

There's a lot left to be done, especially on the documentation and testing 
side, so don't expect everything to be properly documented or actually work 
:)  But I think it's at a point where it can be useful.

Enjoy!

http://www.dsource.org/projects/dcollections

-Steve

May 05 2008

Some guy <no-email quit-bugging-me.com> writes:

Steven Schveighoffer Wrote:

 I've been tinkering with a collection package that is a hybrid between C++, 
 Java, and Tango, utilizing the best D features (such as slicing, foreach, 
 etc.).
 
 The result is dcollections.  Here is a list of the features:
 
     * Hash, RBTree, Link, and Array implementations for appropriate 
 containers.
     * List, Set, Map, and Multiset containers provided.
     * Able to swap out underlying implementation of a container, or 
 customize implementation.
     * Minimized heap activity. All cursors are struct-based.
     * Should be compatible with both Tango and Phobos (tested with Tango).
     * Slicing where appropriate (currently only ArrayList, but will add to 
 other containers).
     * Removal while traversing.
     * Removal of elements does not invalidate cursors where possible.
     * Cursors can be kept for later use (such as O(1) removal if supported 
 by the container).
     * Interfaces for implementation-independent code.
     * Concatenation and appending for lists.
     * dup functions.
     * Set/Map intersection.
     * Handy filter, transform, and chain iterators.
 
 There's a lot left to be done, especially on the documentation and testing 
 side, so don't expect everything to be properly documented or actually work 
 :)  But I think it's at a point where it can be useful.
 
 Enjoy!
 
 http://www.dsource.org/projects/dcollections
 
 -Steve 
 
 

Just wondering why people (especially from the U.S.) always build red-black
trees instead of AVL trees ( http://en.wikipedia.org/wiki/Avl_tree ) which are
faster, simpler to understand, simpler to implement and support the same set of
operations?

May 05 2008

Spacen Jasset <spacenjasset yahoo.co.uk> writes:

Some guy wrote:
 Steven Schveighoffer Wrote:
 
 I've been tinkering with a collection package that is a hybrid between C++, 
 Java, and Tango, utilizing the best D features (such as slicing, foreach, 
 etc.).

 The result is dcollections.  Here is a list of the features:

     * Hash, RBTree, Link, and Array implementations for appropriate 
 containers.
     * List, Set, Map, and Multiset containers provided.
     * Able to swap out underlying implementation of a container, or 
 customize implementation.
     * Minimized heap activity. All cursors are struct-based.
     * Should be compatible with both Tango and Phobos (tested with Tango).
     * Slicing where appropriate (currently only ArrayList, but will add to 
 other containers).
     * Removal while traversing.
     * Removal of elements does not invalidate cursors where possible.
     * Cursors can be kept for later use (such as O(1) removal if supported 
 by the container).
     * Interfaces for implementation-independent code.
     * Concatenation and appending for lists.
     * dup functions.
     * Set/Map intersection.
     * Handy filter, transform, and chain iterators.

 There's a lot left to be done, especially on the documentation and testing 
 side, so don't expect everything to be properly documented or actually work 
 :)  But I think it's at a point where it can be useful.

 Enjoy!

 http://www.dsource.org/projects/dcollections

 -Steve 

 
 Just wondering why people (especially from the U.S.) always build red-black
trees instead of AVL trees ( http://en.wikipedia.org/wiki/Avl_tree ) which are
faster, simpler to understand, simpler to implement and support the same set of
operations?

It mentions in that very wiki article that AVL trees may be slower at 
insertion than red-black trees.

May 06 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Some guy" wrote

 Just wondering why people (especially from the U.S.) always build 
 red-black trees instead of AVL trees ( 
 http://en.wikipedia.org/wiki/Avl_tree ) which are faster, simpler to 
 understand, simpler to implement and support the same set of operations?

Maybe because I've never heard of them :)  Although, I probably did in my 
algorithms class, I just don't remember paying a lot of attention there...

No matter though, the TreeMap collection's implementation can be easily 
swapped out for an AVL version.  Thanks for the link!

-Steve

May 06 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Some guy wrote:
 Steven Schveighoffer Wrote:
 
 I've been tinkering with a collection package that is a hybrid between C++, 
 Java, and Tango, utilizing the best D features (such as slicing, foreach, 
 etc.).

 The result is dcollections.  Here is a list of the features:

     * Hash, RBTree, Link, and Array implementations for appropriate 
 containers.
     * List, Set, Map, and Multiset containers provided.
     * Able to swap out underlying implementation of a container, or 
 customize implementation.
     * Minimized heap activity. All cursors are struct-based.
     * Should be compatible with both Tango and Phobos (tested with Tango).
     * Slicing where appropriate (currently only ArrayList, but will add to 
 other containers).
     * Removal while traversing.
     * Removal of elements does not invalidate cursors where possible.
     * Cursors can be kept for later use (such as O(1) removal if supported 
 by the container).
     * Interfaces for implementation-independent code.
     * Concatenation and appending for lists.
     * dup functions.
     * Set/Map intersection.
     * Handy filter, transform, and chain iterators.

 There's a lot left to be done, especially on the documentation and testing 
 side, so don't expect everything to be properly documented or actually work 
 :)  But I think it's at a point where it can be useful.

 Enjoy!

 http://www.dsource.org/projects/dcollections

 -Steve 

 
 Just wondering why people (especially from the U.S.) always build red-black
trees instead of AVL trees ( http://en.wikipedia.org/wiki/Avl_tree ) which are
faster, simpler to understand, simpler to implement and support the same set of
operations?

Well, according to my data structures prof...

- For data inserted perfectly randomly, plain BSTs work best
- For data inserted mostly randomly but with occasional ordering, 
red-black trees work best. This is why Java's collection framework uses 
them, since for unknown user data they're probably the best choice
- For data inserted in a mostly-ordered fashion, AVL trees work best
- For data retrieved in an ordered or clumped fashion, Splay trees are best
- For data that won't fit entirely in memory, B+trees are best (but 
that's sort of a special niche)

May 07 2008

torhu <no spam.invalid> writes:

Steven Schveighoffer wrote:
 I've been tinkering with a collection package that is a hybrid between C++, 
 Java, and Tango, utilizing the best D features (such as slicing, foreach, 
 etc.).

Interesting stuff.  Any plans for adding sorting?

May 06 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"torhu" wrote
 Steven Schveighoffer wrote:
 I've been tinkering with a collection package that is a hybrid between 
 C++, Java, and Tango, utilizing the best D features (such as slicing, 
 foreach, etc.).

 Interesting stuff.  Any plans for adding sorting?

I wasn't planning on it...

I think sorting really only makes sense in cases where quick lookup is 
possible.  The implementations I support are:

RBTree -> already sorted
Hash -> can't be sorted
Link -> don't have quick lookup.
Array -> possible to sort using the built-in array sort:

ArrayList!(int) x([5,4,3,2,1]);

x.asArray.sort;

Basically, for ArrayList, any possible sort routines that are written for 
builtin arrays can be applied.

For unsortable implementations, they can be 'sorted' by adding them to a 
Tree implementation, which will sort on insertion.  Or if you like, turned 
into an array, and sorted there.

Is there another requirement I'm missing?

-Steve

May 06 2008

BCS <ao pathlink.com> writes:

Reply to Steven,

 "torhu" wrote
 
 Steven Schveighoffer wrote:
 
 I've been tinkering with a collection package that is a hybrid
 between C++, Java, and Tango, utilizing the best D features (such as
 slicing, foreach, etc.).
 

 Interesting stuff.  Any plans for adding sorting?
 

 I wasn't planning on it...
 
 I think sorting really only makes sense in cases where quick lookup is
 possible.  The implementations I support are:
 
 RBTree -> already sorted
 Hash -> can't be sorted
 Link -> don't have quick lookup.
 Array -> possible to sort using the built-in array sort:
 ArrayList!(int) x([5,4,3,2,1]);
 
 x.asArray.sort;
 
 Basically, for ArrayList, any possible sort routines that are written
 for builtin arrays can be applied.
 
 For unsortable implementations, they can be 'sorted' by adding them to
 a Tree implementation, which will sort on insertion.  Or if you like,
 turned into an array, and sorted there.
 
 Is there another requirement I'm missing?
 
 -Steve
 

One thing that might make this easier would be a collection of collections 
type:

Many!(T, RBTree, Array) foo

foo would have the API for both RBTree and Array but adding/removing to/from 
one would also do the same to the other. This might be more useful with
something 
like several RBTrees where each is sorted differently.

May 06 2008

janderson <askme me.com> writes:

BCS wrote:
 Reply to Steven,
 
 "torhu" wrote

 Steven Schveighoffer wrote:

 I've been tinkering with a collection package that is a hybrid
 between C++, Java, and Tango, utilizing the best D features (such as
 slicing, foreach, etc.).

 Interesting stuff.  Any plans for adding sorting?

 I wasn't planning on it...

 I think sorting really only makes sense in cases where quick lookup is
 possible.  The implementations I support are:

 RBTree -> already sorted
 Hash -> can't be sorted
 Link -> don't have quick lookup.
 Array -> possible to sort using the built-in array sort:
 ArrayList!(int) x([5,4,3,2,1]);

 x.asArray.sort;

 Basically, for ArrayList, any possible sort routines that are written
 for builtin arrays can be applied.

 For unsortable implementations, they can be 'sorted' by adding them to
 a Tree implementation, which will sort on insertion.  Or if you like,
 turned into an array, and sorted there.

 Is there another requirement I'm missing?

 -Steve

 
 One thing that might make this easier would be a collection of 
 collections type:
 
 Many!(T, RBTree, Array) foo
 
 foo would have the API for both RBTree and Array but adding/removing 
 to/from one would also do the same to the other. This might be more 
 useful with something like several RBTrees where each is sorted 
 differently.
 
 

If I understand you correctly, that could also be useful if you want to 
iterate quickly though a list, and also find something quickly other 
times.  Particularly if you have a hash and a array put together.

-Joel

May 07 2008

torhu <no spam.invalid> writes:

Steven Schveighoffer wrote:
 "torhu" wrote
 Steven Schveighoffer wrote:
 I've been tinkering with a collection package that is a hybrid between 
 C++, Java, and Tango, utilizing the best D features (such as slicing, 
 foreach, etc.).

 Interesting stuff.  Any plans for adding sorting?

 
 I wasn't planning on it...
 
 I think sorting really only makes sense in cases where quick lookup is 
 possible.  The implementations I support are:
 
 RBTree -> already sorted
 Hash -> can't be sorted
 Link -> don't have quick lookup.
 Array -> possible to sort using the built-in array sort:
 

I was mostly thinking about a sortable linked list.  I guess it can be 
done by copying the contents into another container, like you suggest. 
How fast that would be would probably depend on how your linked list is 
implemented, how much needs to be copied before sorting can be done.

May 06 2008

Charles D Hixson <charleshixsn earthlink.net> writes:

Steven Schveighoffer wrote:
 I've been tinkering with a collection package that is a hybrid between C++, 
 Java, and Tango, utilizing the best D features (such as slicing, foreach, 
 etc.).
 
 The result is dcollections.  Here is a list of the features:
 
     * Hash, RBTree, Link, and Array implementations for appropriate 
 containers.
     * List, Set, Map, and Multiset containers provided.
     * Able to swap out underlying implementation of a container, or 
 customize implementation.
     * Minimized heap activity. All cursors are struct-based.
     * Should be compatible with both Tango and Phobos (tested with Tango).
     * Slicing where appropriate (currently only ArrayList, but will add to 
 other containers).
     * Removal while traversing.
     * Removal of elements does not invalidate cursors where possible.
     * Cursors can be kept for later use (such as O(1) removal if supported 
 by the container).
     * Interfaces for implementation-independent code.
     * Concatenation and appending for lists.
     * dup functions.
     * Set/Map intersection.
     * Handy filter, transform, and chain iterators.
 
 There's a lot left to be done, especially on the documentation and testing 
 side, so don't expect everything to be properly documented or actually work 
 :)  But I think it's at a point where it can be useful.
 
 Enjoy!
 
 http://www.dsource.org/projects/dcollections
 
 -Steve 
 
 

Nice!  Have you considered adding a B+Tree (or a B*Tree). 
That would require a backing file store...but it could be 
quite useful.

May 06 2008

D Programming

C/C++ Programming

Other

digitalmars.D.announce - dcollections 0.01 release