digitalmars.D.learn - How do I obtain the default hash of a user-defined struct

reply "dnspies" <dspies ualberta.ca> writes:
How can I get the default-hash of a struct I've defined (to be 
used as part of the hash for some containing type)?
Apr 02 2014
parent reply "FreeSlave" <freeslave93 gmail.com> writes:
On Wednesday, 2 April 2014 at 20:14:31 UTC, dnspies wrote:
 How can I get the default-hash of a struct I've defined (to be 
 used as part of the hash for some containing type)?
    UserDefined userDefined;
    writeln(typeid(UserDefined).getHash(&userDefined));

Probably there is a better way. I don't like to call typeid for this purpose.
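As a minimal sketch of the call above: the struct `Point` and the helper `defaultHash` are hypothetical names introduced only for illustration. The struct deliberately holds plain value fields only, since the rest of the thread shows that array fields complicate matters.

```d
import std.stdio;

// Hypothetical struct with plain value fields only (no arrays, no pointers).
struct Point
{
    int x;
    int y;
}

// Hypothetical helper wrapping the typeid(...).getHash call from the post.
size_t defaultHash(T)(ref const T value)
{
    return typeid(T).getHash(&value);
}

void main()
{
    auto a = Point(1, 2);
    auto b = Point(1, 2);

    // Compare the two hashes instead of printing raw values,
    // which may differ between compiler and runtime versions.
    writeln(defaultHash(a) == defaultHash(b));
}
```

The helper only hides the `typeid` call; it does not change which hash is computed.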
Apr 02 2014
next sibling parent reply "dnspies" <dspies ualberta.ca> writes:
Thanks,
Actually I'm realizing there's a lot I'm unclear about when it 
comes to default comparison, equality, hashing, etc.

If my struct contains a dynamic array, are the contents of the 
array compared by default, or just the pointers/lengths?

Also, when two arrays are compared for content, are their 
pointers compared first in case they happen to be the same so the 
deep comparison can be shortcut?
Also, how do I obtain the default hash of a dynamic array's 
contents?  Is there one?  I assume there must be, since 
associative arrays can take string as a key type.

To make a struct a valid key type, do I need to implement both 
opCmp and opEquals, or just one or the other?  It says on the page 
about Associative Arrays: "The implementation may use either 
opEquals or opCmp or both."  Does that mean it uses whichever one 
is user-defined (or both if they're both user-defined)?  Or does 
it mean the user is responsible for defining both?

Also, it says "Care should be taken so that the results of 
opEquals and opCmp are consistent with each other when the 
struct/union objects are the same or not."  Certainly this means 
that if a.opEquals(b), then a.opCmp(b) should be 0, but does the 
converse have to be true?

Is there somewhere I can find information about default operator 
implementations and consistency?
Apr 02 2014
parent reply "FreeSlave" <freeslave93 gmail.com> writes:
Contents of struct are compared field by field using comparison 
for the type of each field. Dynamic arrays are compared by 
contents. If you want to compare them by pointer use .ptr 
property.

opEquals and opCmp are not about hashing, I believe. They are 
just operators that help resolve collisions when chaining, i.e. 
when different objects have the same hash (since a hash may not 
be unique).
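The comparison rules described above can be sketched as follows. `Wrapper` is a hypothetical struct used only for this illustration, and `.dup` is used to make sure the two arrays live in separate allocations.

```d
import std.stdio;

// Hypothetical struct with a single dynamic-array field.
struct Wrapper
{
    int[] arr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a.dup;            // same contents, separate allocation

    writeln(a == b);            // == on arrays compares contents
    writeln(a.ptr is b.ptr);    // .ptr compares the addresses only
    writeln(a is b);            // `is` compares pointer and length

    // Default struct equality goes field by field, so the array
    // field is compared by contents as well.
    writeln(Wrapper(a) == Wrapper(b));
}
```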
Apr 02 2014
parent "dnspies" <dspies ualberta.ca> writes:
On Wednesday, 2 April 2014 at 22:07:36 UTC, FreeSlave wrote:
 Contents of struct are compared field by field using comparison 
 for the type of each field. Dynamic arrays are compared by 
 contents. If you want to compare them by pointer use .ptr 
 property.

 opEquals and opCmp are not about hashing, I believe. They are 
 just operators to help when dealing with chaining when 
 different objects have same hash (since hash may be not unique)
Thanks, I'll post the other questions in separate threads.
Apr 02 2014
prev sibling parent reply "dnspies" <dspies ualberta.ca> writes:
On Wednesday, 2 April 2014 at 20:39:47 UTC, FreeSlave wrote:
 On Wednesday, 2 April 2014 at 20:14:31 UTC, dnspies wrote:
 How can I get the default-hash of a struct I've defined (to be 
 used as part of the hash for some containing type)?
  UserDefined userDefined;
  writeln(typeid(UserDefined).getHash(&userDefined));

  Probably there is a better way. I don't like to call typeid for 
  this purpose.
This doesn't work.  It prints two different hashes for equal 
objects.  I meant how do I get the default hash which is used 
by an associative array.

    import std.stdio;

    struct my_struct {
        int[] arr;
    }

    void main() {
        my_struct s1;
        s1.arr = [1,2,3];
        my_struct s2;
        s2.arr = [1,2,3];
        writeln(s1 == s2);
        writeln(typeid(my_struct).getHash(&s1));
        writeln(typeid(my_struct).getHash(&s2));
    }

    true
    626617119
    2124658624
Apr 03 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
dnspies:

 This doesn't work.  It prints two different hashes for equal 
 objects.  I meant how do I get the default hash which is used 
 by an associative array.

 import std.stdio;

 struct my_struct {
 	int[] arr;
 }

 void main() {
 	my_struct s1;
 	s1.arr = [1,2,3];
 	my_struct s2;
 	s2.arr = [1,2,3];
 	writeln(s1 == s2);
 	writeln(typeid(my_struct).getHash(&s1));
 	writeln(typeid(my_struct).getHash(&s2));
 }

 true
 626617119
 2124658624
Take a look at the output of this program:

    import std.stdio;

    struct MyStruct {
        int[] arr;
    }

    void main() {
        MyStruct s1;
        s1.arr = [1,2,3];
        MyStruct s2;
        s2.arr = [1,2,3];
        writeln(s1 == s2);
        writeln(typeid(MyStruct).getHash(&s1));
        writeln(typeid(MyStruct).getHash(&s2));
        int[MyStruct] aa;
        aa[s1] = 10;
        aa[s2] = 20;
        writeln(aa);
    }

I filed this big problem four years ago or more.

Workarounds: define (carefully!) the three hash protocol methods, 
or use a tuple.

Bye,
bearophile
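A minimal sketch of the first workaround, assuming that "the three hash protocol methods" means toHash, opEquals and opCmp. The hash formula (multiply by 31 and add each element) is an arbitrary choice made for this illustration, not what the runtime uses.

```d
import std.stdio;

struct MyStruct
{
    int[] arr;

    // Hash the array contents instead of the pointer/length pair.
    // The multiplier 31 is an arbitrary illustrative choice.
    size_t toHash() const @safe pure nothrow
    {
        size_t h = 0;
        foreach (x; arr)
            h = h * 31 + x;
        return h;
    }

    // Equality by contents, consistent with toHash.
    bool opEquals(ref const MyStruct other) const @safe pure nothrow
    {
        return arr == other.arr;
    }

    // Ordering by contents, consistent with opEquals.
    int opCmp(ref const MyStruct other) const
    {
        if (arr < other.arr) return -1;
        if (arr > other.arr) return 1;
        return 0;
    }
}

void main()
{
    MyStruct s1, s2;
    s1.arr = [1, 2, 3];
    s2.arr = [1, 2, 3];

    int[MyStruct] aa;
    aa[s1] = 10;
    aa[s2] = 20;            // same key by contents, so this overwrites

    writeln(aa.length);
}
```

The point of the sketch is that all three methods look at the same data (the array contents), so equal keys always hash and compare alike.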
Apr 03 2014
next sibling parent reply "dnspies" <dspies ualberta.ca> writes:
Oh so the problem isn't that that ISN'T the default hash used by 
an AA.  It's worse.  The problem is that that IS the default hash 
used by an AA.

When you say "use a tuple", do you mean that the hash 
implementation for Tuples is defined recursively and based on its 
members' hashes?  There doesn't seem to be an opHash for Tuples 
AFAICT.  If not, could you provide an example?

Also, if this problem isn't going to be fixed any time soon, 
shouldn't it be documented directly on the AA page somewhere?  
It's just the sort of surprise that I would have no hope of 
figuring out when trying to debug my program.  (I put this 
wrapped_string into my AA, but when I try to fetch it again, it's 
disappeared!!??)

What's worse, there's no link provided to TypeInfo.getHash, 
instead it just says "If the KeyType is a struct or union type, a 
default mechanism is used to compute the hash and comparisons of 
it based on the binary data within the struct value" which sounds 
as though they're saying "don't worry, we've handled everything" 
which is the opposite of true.
Apr 03 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
dnspies:

 Oh so the problem isn't that that ISN'T the default hash used 
 ...
 everything" which is the opposite of true.
You can post an elaboration of this in the main D newsgroup.

Bye,
bearophile
Apr 03 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 03 Apr 2014 17:42:16 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:


 I have filed this big problem four years ago or more.
Bug report?

-Steve
Apr 03 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Bug report?
I don't remember, sorry, it's an ancient problem. Probably one of 
my top three D problems :-)

Bye,
bearophile
Apr 03 2014
prev sibling parent reply "dnspies" <dspies ualberta.ca> writes:
On Thursday, 3 April 2014 at 23:01:27 UTC, Steven Schveighoffer 
wrote:
 On Thu, 03 Apr 2014 17:42:16 -0400, bearophile 
 <bearophileHUGS lycos.com> wrote:


 I have filed this big problem four years ago or more.
 Bug report?

 -Steve
This is the closest I could find:

https://d.puremagic.com/issues/show_bug.cgi?id=11025

Here's a couple other related bugs:

https://d.puremagic.com/issues/show_bug.cgi?id=12516
https://d.puremagic.com/issues/show_bug.cgi?id=10374
https://d.puremagic.com/issues/show_bug.cgi?id=1926
Apr 04 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
I just found this related issue:

    import std.stdio;

    struct MyKey {
        int a;
        char[] b;
    }

    void main() {
        auto key1 = MyKey(1, "abc".dup);
        writefln("key1 hash = %x", typeid(typeof(key1)).getHash(&key1));

        char[] sneaky = "def".dup;
        key1.b[] = sneaky[]; // N.B.: change array contents, keep same pointer

        writefln("key1 hash = %x", typeid(typeof(key1)).getHash(&key1));
    }

Output:

    key1 hash = 6cba62173367a870
    key1 hash = 6cba62173367a870

This means that the hash of MyKey is computed based on its binary
representation, disregarding the contents of any array (and other
reference) fields. This will certainly break AA's.

I'm almost certain this has already been reported as a bug, but I
vaguely remember someone mentioning a while back that this is supposed
to have been fixed. But I still get the above problem in DMD git HEAD.
:-(

T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge
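For contrast, here is a sketch of the same MyKey with a user-defined, content-aware toHash, so that the same experiment yields a different hash after the contents change. The mixing formula is an illustrative assumption, not the runtime's algorithm, and the sketch does not make it safe to mutate a key that is already stored in an AA.

```d
import std.stdio;

struct MyKey
{
    int a;
    char[] b;

    // Mix the integer field with the characters of b, so the hash
    // follows the array contents rather than its pointer.
    // The multiplier 31 is an arbitrary illustrative choice.
    size_t toHash() const @safe pure nothrow
    {
        size_t h = cast(size_t) a;
        foreach (c; b)
            h = h * 31 + c;
        return h;
    }

    // Equality by contents, consistent with toHash.
    bool opEquals(ref const MyKey other) const @safe pure nothrow
    {
        return a == other.a && b == other.b;
    }
}

void main()
{
    auto key1 = MyKey(1, "abc".dup);
    immutable before = key1.toHash();

    char[] sneaky = "def".dup;
    key1.b[] = sneaky[];        // change array contents, keep same pointer

    immutable after = key1.toHash();
    writefln("before = %x, after = %x", before, after);
}
```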
Apr 04 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 This means that the hash of MyKey is computed based on its 
 binary
 representation, disregarding the contents of any array (and 
 other
 reference) fields. This will certainly break AA's.

 I'm almost certain this has already been reported as a bug, but 
 I
 vaguely remember someone mentioning a while back that this is 
 supposed
 to have been fixed. But I still get the above problem in DMD 
 git HEAD.
 :-(
It needs to be fixed. (Or the code should not compile).

Bye,
bearophile
Apr 04 2014