www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - A set type implemented as an AA wrapper

reply mark <mark qtrac.eu> writes:
I use sets a lot and since I believe that D's rbtree is O(lg n) 
for add/remove/in and that D's AA is O(1) for these, I want to 
implement a set in terms of an AA.

Below is the code I've got so far. It allows for add and remove. 
However, it has three problems (that I know of):

XXX: I need to use an if on the struct to restrict T to be a type 
that supports toHash and opEquals (i.e., to be a valid AA key)

YYY: The range() method is clearly not good D style but I don't 
know how to support foreach (item; aaset) ...

ZZZ: I can't figure out how to support the in operator.

// XXX how do I write the if to ensure T is a valid AA key that 
suppors
// toHash & opEquals?
struct AAset(T) {
     private {
         alias Unit = void[0];
         enum unit = Unit.init;
         Unit[T] set;
     }
     size_t length() const { return set.length; }
     void add(T item) { set[item] = unit; }
     bool remove(T item) { return set.remove(item); }

     // YYY is there a better way so that people can just use
     //  foreach (var; aaset)
     auto range() { return set.byKey; }

     bool opBinaryRight(string op)(T lhs) { // ZZZ doesn't work
         static if (op == "in") return lhs in set;
         else static assert(0, "operator " ~ op ~ " not 
supported");
     }
     // TODO union(), intersection(), difference(), 
symmetric_difference()
}

unittest {
     import std.algorithm: sort;
     import std.array: array;
     import std.range: enumerate;
     import std.stdio: writeln;
     import std.typecons: Tuple;

     alias Pair = Tuple!(int, "count", string, "word");
     auto inputs = [Pair(1, "one"), Pair(2, "two"), Pair(3, 
"three"),
                    Pair(4, "four"), Pair(4, "two"), Pair(5, 
"five"),
                    Pair(6, "six")];
     AAset!string words;
     assert(words.length == 0);
     foreach (pair; inputs) {
         words.add(pair.word);
         assert(words.length == pair.count);
     }
     auto len = words.length;
     assert(!words.remove("missing"));
     assert(words.remove("one"));
     assert(words.length == len - 1);
     auto expected = ["five", "four", "six", "three", "two"];
     foreach (i, word; words.range.array.sort.enumerate)
         assert(word == expected[i]);
     /*
     assert("Z" !in words);
     assert("three" !in words);
     */
}
Mar 12 2020
next sibling parent Ferhat =?UTF-8?B?S3VydHVsbXXFnw==?= <aferust gmail.com> writes:
On Thursday, 12 March 2020 at 08:51:24 UTC, mark wrote:
 I use sets a lot and since I believe that D's rbtree is O(lg n) 
 for add/remove/in and that D's AA is O(1) for these, I want to 
 implement a set in terms of an AA.
 XXX: I need to use an if on the struct to restrict T to be a 
 type that supports toHash and opEquals (i.e., to be a valid AA 
 key)
Maybe there is another way for this. But I would consider using: std.traits.hasMember
 YYY: The range() method is clearly not good D style but I don't
opApply can be used for it.
 ZZZ: I can't figure out how to support the in operator.
V* opBinaryRight(string op) with in should return a pointer for valid entries and null for non-existing entries. Thus, "if(key in AA){...}" works as expected. That is how AA implementation of druntime works. Recently I ve ported it for -betterC. You can take a look at: https://github.com/aferust/bcaa
Mar 12 2020
prev sibling next sibling parent reply Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On Thursday, 12 March 2020 at 08:51:24 UTC, mark wrote:
 I use sets a lot and since I believe that D's rbtree is O(lg n) 
 for add/remove/in and that D's AA is O(1) for these, I want to 
 implement a set in terms of an AA.

 Below is the code I've got so far. It allows for add and 
 remove. However, it has three problems (that I know of):

 XXX: I need to use an if on the struct to restrict T to be a 
 type that supports toHash and opEquals (i.e., to be a valid AA 
 key)
I'd suggest simply testing if an AA with that key type is valid: struct AAset(T) if (is(int[T]))
 YYY: The range() method is clearly not good D style but I don't 
 know how to support foreach (item; aaset) ...
As Ferhat points out, you could use opApply for this. There's also the option of implenting the range primitives front, popFront() and empty. However, the easiest solution is the one you've already chosen, combined with alias this: struct AAset(T) if (is(int[T])) { // stuffs... auto range() { return set.byKey; } alias range this; }
 ZZZ: I can't figure out how to support the in operator.
'in' for AAs returns a pointer to the value it finds, or null if no item is found. This pointer does not implicitly convert to bool when returned from a function, so you need to compare it to null: bool opBinaryRight(string op)(T lhs) { static if (op == "in") return (lhs in set) != null; else static assert(0, "operator " ~ op ~ " not supported"); } I would also suggest using a template specialization instead of static if and static assert: bool opBinaryRight(string op : "in")(T lhs) { return (lhs in set) != null; } -- Simen
Mar 12 2020
parent reply mark <mark qtrac.eu> writes:
On Thursday, 12 March 2020 at 11:33:25 UTC, Simen Kjærås wrote:
[snip]
 I'd suggest simply testing if an AA with that key type is valid:

     struct AAset(T) if (is(int[T]))
That's very subtle, but it works.
 As Ferhat points out, you could use opApply for this. There's 
 also the option of implenting the range primitives front, 
 popFront() and empty. However, the easiest solution is the one 
 you've already chosen, combined with alias this:

     struct AAset(T) if (is(int[T])) {
         // stuffs...
         auto range() { return set.byKey; }
         alias range this;
     }
Again, subtle, and again it works!
 I would also suggest using a template specialization instead of 
 static if and static assert:

     bool opBinaryRight(string op : "in")(T lhs) {
         return (lhs in set) != null;
     }
Thanks, you've solved all the issued I had! Here's the complete revised code: struct AAset(T) if (is(int[T])) { private { alias Unit = void[0]; enum unit = Unit.init; Unit[T] set; } size_t length() const { return set.length; } void add(T item) { set[item] = unit; } bool remove(T item) { return set.remove(item); } auto range() { return set.byKey; } alias range this; bool opBinaryRight(string op: "in")(T lhs) { return (lhs in set) != null; } } unittest { import std.algorithm: sort; import std.array: array; import std.range: enumerate; import std.stdio: writeln; import std.typecons: Tuple; writeln("unittest for the aaset library."); alias Pair = Tuple!(int, "count", string, "word"); immutable inputs = [Pair(1, "one"), Pair(2, "two"), Pair(3, "three"), Pair(4, "four"), Pair(4, "two"), Pair(5, "five"), Pair(6, "six")]; AAset!string words; assert(words.length == 0); foreach (pair; inputs) { words.add(pair.word); assert(words.length == pair.count); } immutable len = words.length; assert(!words.remove("missing")); assert(words.remove("one")); assert(words.length == len - 1); immutable expected = ["five", "four", "six", "three", "two"]; foreach (i, word; words.array.sort.enumerate) assert(word == expected[i]); assert("Z" !in words); assert("three" in words); }
Mar 12 2020
parent mark <mark qtrac.eu> writes:
Just in case anyone would like to use it, I've put it on github 
and also added it as a dub package.
https://code.dlang.org/packages/aaset
Mar 12 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Mar 12, 2020 at 08:51:24AM +0000, mark via Digitalmars-d-learn wrote:
[...]
 YYY: The range() method is clearly not good D style but I don't know
 how to support foreach (item; aaset) ...
The usual idiom is to overload .opSlice, then you can do: foreach (item; aaset[]) { ... } IOW rename .range to .opSlice.
 ZZZ: I can't figure out how to support the in operator.
Note that 'x in aa' returns a pointer, not a bool. You could try: bool opBinaryRight(string op : "in")(T lhs) { return (lhs in set) !is null; } T -- The richest man is not he who has the most, but he who needs the least.
Mar 12 2020
parent mark <mark qtrac.eu> writes:
On Thursday, 12 March 2020 at 16:02:14 UTC, H. S. Teoh wrote:
 On Thu, Mar 12, 2020 at 08:51:24AM +0000, mark via 
 Digitalmars-d-learn wrote: [...]
 YYY: The range() method is clearly not good D style but I 
 don't know
 how to support foreach (item; aaset) ...
The usual idiom is to overload .opSlice, then you can do: foreach (item; aaset[]) { ... } IOW rename .range to .opSlice.
 ZZZ: I can't figure out how to support the in operator.
Note that 'x in aa' returns a pointer, not a bool. You could try: bool opBinaryRight(string op : "in")(T lhs) { return (lhs in set) !is null; }
I renamed .range to .opSlice (and deleted the alias range this) and foreach(item; aaset) works fine. As for the != null, that was a mistake (due to Java) which I've now fixed. I'm hoping to add union & intersection soon too. Thanks.
Mar 12 2020