www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How is opEquals used in toHash

reply PinDPlugga <a a.com> writes:
In the solution to one of the exercises in Programming in D the 
unittests fail with respect to the toHash implementation.

Here is a link to the full solution provided:
https://run.dlang.io/gist/99ddf791f86aaa9d333d032166aadcb9?args=-unittest%20-main

and the link to the relevant section in the book:
https://ddili.org/ders/d.en/object.html

so while the unittest for equality passes, the next test with 
`in` is where the failure occurs:
```D
// ...
     assert(area1 == area2);    // Passes

     double[TriangularArea] areas;
     areas[area1] = 1.25;

     assert(area2 in areas); // Fails
// ...
```

the issue must be with the toHash function given
``` D
     override size_t toHash() const {
         /* Since the 'points' member is an array, we can take
          * advantage of the existing toHash algorithm for
          * array types. */
         return typeid(points).getHash(&points);
     }
```

I looked into the object library documentation and saw that the 
template getHash calls hashOf which calls 
core.internal.hash.hashOf etc.

But what I do not understand is why opEquals is necessary and 
where in the implementation of toHash it plays its role? Since 
area1 and area2 have different static arrays of Points I 
understand why `typeid(points).getHash(&points)` would produce 
different values for each, but it seems like the opEquals should 
play a part in making them produce the same hash.
May 18 2021
parent Simen =?UTF-8?B?S2rDpnLDpXM=?= <simen.kjaras gmail.com> writes:
On Tuesday, 18 May 2021 at 10:14:26 UTC, PinDPlugga wrote:
 But what I do not understand is why opEquals is necessary and 
 where in the implementation of toHash it plays its role? Since 
 area1 and area2 have different static arrays of Points I 
 understand why `typeid(points).getHash(&points)` would produce 
 different values for each, but it seems like the opEquals 
 should play a part in making them produce the same hash.
opEquals plays no part in how toHash does its thing, but it's important for how AAs work. When there's a hash collision, the colliding items are placed in a bucket, which is generally iterated linearly to find the matching element, using opEquals. Example code: ```d struct S { int i; size_t toHash() const nothrow safe { return 3; } bool opEquals(ref const S rhs) const { return i == rhs.i; } } unittest { int[S] a; a[S(1)] = 3; a[S(2)] = 4; // Hash collision - both return 3 assert(a.length == 2); // But they're both there assert(S(1) in a); assert(S(2) in a); int b = a[S(1)]; int c = a[S(2)]; assert(b == 3); assert(c == 4); } ``` And pseudocode for how an AA works: ```d struct AA(K, V) { import std.typecons : Tuple; alias Pair = Tuple!(K, "key", V, "value"); alias Bucket = Pair[]; Bucket[] buckets; this(int bucketCount) { buckets.length = bucketCount; } void opIndexAssign(V value, K key) { // Use hash to find correct bucket size_t hash = key.toHash(); size_t index = hash % buckets.length; Bucket bucket = buckets[index]; foreach (e; bucket) { if (e.key == key) { // Check for duplicates with opEquals e.value = value; // Overwrite existing return; } } bucket ~= Pair(key, value); // Array could reallocate when appended to, // so assign it back to the bucket list buckets[index] = bucket; } V opIndex(K key) { // Use hash to find correct bucket size_t hash = key.toHash(); size_t index = hash % buckets.length; Bucket bucket = buckets[index]; foreach (e; bucket) { if (e.key == key) { // Check with opEquals return e.value; } } throw new Exception("Key not found"); } } unittest { AA!(S, int) a = AA!(S, int)(24); a[S(1)] = 3; a[S(2)] = 4; assert(a[S(1)] == 3); assert(a[S(2)] == 4); } ``` This omits plenty of details, but should give some idea how AAs work. -- Simen
May 18 2021