www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Default hashing function for AA's

reply RazvanN <razvan.nitu1305 gmail.com> writes:
Hi all,

We in the UPB dlang group have been having discussions about the 
hashing functions of associative arrays. In particular, we were 
wondering why is the AA implementation in druntime is not using 
the hash function implemented in 
druntime/src/core/internal/hash.hashOf for classes that don't 
define toHash().

For us, that seems to be a very good default hashing function.

Best regards,
On behalf of the UPB Dlang group
Oct 09 2017
next sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Monday, 9 October 2017 at 14:11:13 UTC, RazvanN wrote:
 We in the UPB dlang group have been having discussions about 
 the hashing functions of associative arrays. In particular, we 
 were wondering why is the AA implementation in druntime is not 
 using the hash function implemented in 
 druntime/src/core/internal/hash.hashOf for classes that don't 
 define toHash().

 For us, that seems to be a very good default hashing function.
Further, I haven't found any instructions on changing the default hash-digest for `hashOf`. Is this in conflict with `hashOf` being `pure`? Could the interface to builtin AA's be extended to support changing the default hash algorithm (which in turn `hashOf` will use) upon AA instantiation?
Oct 09 2017
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 10/9/17 10:11 AM, RazvanN wrote:
 Hi all,
 
 We in the UPB dlang group have been having discussions about the hashing 
 functions of associative arrays. In particular, we were wondering why is 
 the AA implementation in druntime is not using the hash function 
 implemented in druntime/src/core/internal/hash.hashOf for classes that 
 don't define toHash().
AA uses typeid(Key).getHash. [1] For objects, this calls the virtual function `toHash`. [2] Please keep in mind that all you are hashing is the class reference pointer, as that is the default comparison for `opEquals`. It might make sense to shuffle those bits a bit, since the bucket algorithm only looks at the lower bits of the hash, and this basically guarantees keys with the default hash will be empty in the first few buckets, since class objects are aligned on a power-of-2 boundary. But I'm not sure running a full blown hash on the pointer is necessary. Perhaps just xor-ing the upper bits with the lower bits makes more sense. Alternatively, you could change the default `opEquals` but that may break things more than expected. What you *can't* do is change what the hash is based on without changing `opEquals`. Note that I wouldn't recommend using an Object as a key without defining toHash and opEquals anyway. -Steve [1] https://github.com/dlang/druntime/blob/master/src/rt/aaA.d#L310 [2] https://github.com/dlang/druntime/blob/master/src/object.d#L66
Oct 10 2017
parent =?iso-8859-1?Q?Robert_M._M=FCnch?= <robert.muench saphirion.com> writes:
On 2017-10-10 15:22:05 +0000, Steven Schveighoffer said:

 AA uses typeid(Key).getHash. [1]
 
 For objects, this calls the virtual function `toHash`. [2]
 
 Please keep in mind that all you are hashing is the class reference 
 pointer, as that is the default comparison for `opEquals`. It might 
 make sense to shuffle those bits a bit, since the bucket algorithm only 
 looks at the lower bits of the hash, and this basically guarantees keys 
 with the default hash will be empty in the first few buckets, since 
 class objects are aligned on a power-of-2 boundary.
 
 But I'm not sure running a full blown hash on the pointer is necessary. 
 Perhaps just xor-ing the upper bits with the lower bits makes more 
 sense.
 
 Alternatively, you could change the default `opEquals` but that may 
 break things more than expected.
 
 What you *can't* do is change what the hash is based on without 
 changing `opEquals`.
 
 Note that I wouldn't recommend using an Object as a key without 
 defining toHash and opEquals anyway.
 
 -Steve
 
 [1] https://github.com/dlang/druntime/blob/master/src/rt/aaA.d#L310
 [2] https://github.com/dlang/druntime/blob/master/src/object.d#L66
Not sure this thread fits or a new one would be better. Anyway, here is an interesting article about a very fast hashtable and maybe it's a good candidate for the default AA implementation in D. IMO providing a very performant AA implementation is a big asset: https://probablydance.com/2018/05/28/a-new-fast-hash-table-in-response-to-googles- ew-fast-hash-table/ -- Robert M. Münch http://www.saphirion.com smarter | better | faster
May 31 2018