digitalmars.D.learn - How is opEquals used in toHash

PinDPlugga (34/34) May 18 2021 In the solution to one of the exercises in Programming in D the

Simen =?UTF-8?B?S2rDpnLDpXM=?= (78/84) May 18 2021 opEquals plays no part in how toHash does its thing, but it's

PinDPlugga <a a.com> writes:

In the solution to one of the exercises in Programming in D the 
unittests fail with respect to the toHash implementation.

Here is a link to the full solution provided:
https://run.dlang.io/gist/99ddf791f86aaa9d333d032166aadcb9?args=-unittest%20-main

and the link to the relevant section in the book:
https://ddili.org/ders/d.en/object.html

so while the unittest for equality passes, the next test with 
`in` is where the failure occurs:
```D
// ...
     assert(area1 == area2);    // Passes

     double[TriangularArea] areas;
     areas[area1] = 1.25;

     assert(area2 in areas); // Fails
// ...
```

the issue must be with the toHash function given
``` D
     override size_t toHash() const {
         /* Since the 'points' member is an array, we can take
          * advantage of the existing toHash algorithm for
          * array types. */
         return typeid(points).getHash(&points);
     }
```

I looked into the object library documentation and saw that the 
template getHash calls hashOf which calls 
core.internal.hash.hashOf etc.

But what I do not understand is why opEquals is necessary and 
where in the implementation of toHash it plays its role? Since 
area1 and area2 have different static arrays of Points I 
understand why `typeid(points).getHash(&points)` would produce 
different values for each, but it seems like the opEquals should 
play a part in making them produce the same hash.

May 18 2021

Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:

On Tuesday, 18 May 2021 at 10:14:26 UTC, PinDPlugga wrote:
 But what I do not understand is why opEquals is necessary and 
 where in the implementation of toHash it plays its role? Since 
 area1 and area2 have different static arrays of Points I 
 understand why `typeid(points).getHash(&points)` would produce 
 different values for each, but it seems like the opEquals 
 should play a part in making them produce the same hash.

opEquals plays no part in how toHash does its thing, but it's 
important for how AAs work. When there's a hash collision, the 
colliding items are placed in a bucket, which is generally 
iterated linearly to find the matching element, using opEquals. 
Example code:

     ```d
     struct S {
         int i;
         size_t toHash() const nothrow  safe { return 3; }
         bool opEquals(ref const S rhs) const { return i == rhs.i; 
}
     }

     unittest {
         int[S] a;
         a[S(1)] = 3;
         a[S(2)] = 4; // Hash collision - both return 3
         assert(a.length == 2); // But they're both there
         assert(S(1) in a);
         assert(S(2) in a);
         int b = a[S(1)];
         int c = a[S(2)];
         assert(b == 3);
         assert(c == 4);
     }
     ```

And pseudocode for how an AA works:

     ```d
     struct AA(K, V) {
         import std.typecons : Tuple;
         alias Pair = Tuple!(K, "key", V, "value");
         alias Bucket = Pair[];

         Bucket[] buckets;

         this(int bucketCount) {
             buckets.length = bucketCount;
         }

         void opIndexAssign(V value, K key) {
             // Use hash to find correct bucket
             size_t hash = key.toHash();
             size_t index = hash % buckets.length;
             Bucket bucket = buckets[index];

             foreach (e; bucket) {
                 if (e.key == key) { // Check for duplicates with 
opEquals
                      e.value = value; // Overwrite existing
                      return;
                 }
             }

             bucket ~= Pair(key, value);
             // Array could reallocate when appended to,
             // so assign it back to the bucket list
             buckets[index] = bucket;
         }

         V opIndex(K key) {
             // Use hash to find correct bucket
             size_t hash = key.toHash();
             size_t index = hash % buckets.length;
             Bucket bucket = buckets[index];

             foreach (e; bucket) {
                 if (e.key == key) { // Check with opEquals
                     return e.value;
                 }
             }
             throw new Exception("Key not found");
         }
     }

     unittest {
         AA!(S, int) a = AA!(S, int)(24);
         a[S(1)] = 3;
         a[S(2)] = 4;
         assert(a[S(1)] == 3);
         assert(a[S(2)] == 4);
     }
     ```

This omits plenty of details, but should give some idea how AAs 
work.

--
   Simen

May 18 2021

D Programming

C/C++ Programming

Other

digitalmars.D.learn - How is opEquals used in toHash