www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 9771] New: Remove toHash from Object

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771

           Summary: Remove toHash from Object
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: druntime
        AssignedTo: nobody puremagic.com
        ReportedBy: jmdavisProg gmx.com


--- Comment #0 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-21 00:03:37
PDT ---
Per this thread http://forum.dlang.org/post/jtlj1k$1fdj$1 digitalmars.com , it
was decided to try and go the path of removing opEquals, opCmp, toHash, and
toString from Object, as they don't need to be there, and the fact that they're
there is causing issues with attributes such as const ( issue# 1824 ).

This issue is specifically for toHash so that any work towards making it
unnecessary and removing it can be referenced here, whereas the other 3 have
separate bugzilla entries, as the work for them isn't necessarily related to
each other. opEquals, opCmp, and toHash in particular may require substantial
work to be done on the built-in AAs before they can actually be removed from
Object, as the AA implementation is not currently templated like it should be.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 21 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #1 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-21 00:05:52
PDT ---
Related:

opEquals: issue# 9769
opCmp: issue# 9770
toString: issue# 9772

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 21 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771


hsteoh quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hsteoh quickfur.ath.cx


--- Comment #2 from hsteoh quickfur.ath.cx 2013-03-22 18:50:04 PDT ---
This probably can't be implemented until we replace AA's with a library
template, since otherwise it would make it impossible for e.g. classes to be
used as AA keys.

Once we have the library type, though, we can probably make use of UFCS to
provide default implementations of toHash for all types.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #3 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-22 19:09:41
PDT ---
 This probably can't be implemented until we replace AA's with a library
 template

I figured that that was probably the case, and that will certainly slow this down, but it still needs to be done.
 Once we have the library type, though, we can probably make use of UFCS to
 provide default implementations of toHash for all types.

I'd be very much against providing a UFCS toHash function which used compile-time reflection to figure out how to generate a hash for an arbitrary object. That's just begging for trouble, particularly since the behavior of opEquals and toHash need to match. They need to be controlled by the object itself and not 3rd party code. Providing a mixin to which does that would definitely be useful for avoiding boilerplate, but then it's properly under the control of the person writing the class or struct. But providing toHash functions for built-in types via CTFE would certainly be good. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #4 from hsteoh quickfur.ath.cx 2013-03-22 19:28:25 PDT ---
I wouldn't go as far as using introspection to make a default toHash, but I
think there's value in providing a default toHash for classes that just hashes
the object's address (I think that's what the current toHash does).

But if we're going to be forcing users to manually define opEquals and toHash,
then we should make rt.util.hash available to user code. The only thing worse
than having a wrong default opEquals/toHash is a morass of duplicated user code
implementing poorly-designed hashing algorithms for each user-defined type.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #5 from hsteoh quickfur.ath.cx 2013-03-22 19:30:38 PDT ---
Also, if the default behaviour of == for structs is to do bitwise comparison,
then I think it justifies defining toHash for structs that just hashes the bit
representation of the struct. It certainly beats having to define toHash for
every single little helper struct that you write, just so you can use it in an
AA. (That's what really turned me off in C++11's unordered_map -- so much
verbosity for something that's supposed to be simple.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #6 from hsteoh quickfur.ath.cx 2013-03-22 19:32:00 PDT ---
P.S. I meant *default* toHash for structs, of course. Obviously the user should
be able to override the default behaviour (hence the usefulness of using UFCS
for toHash, since a member function always takes precedence over a UFCS
function).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #7 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-22 20:29:03
PDT ---
Whether it makes sense or not to use the bit representation for hashing depends
on the struct, as according TDPL, bitwise comparison is _not_ supposed to be
the default for structs (though as bug# 3789 indicates, that's currently
buggy). However, it wouldn't be all that hard to come up with a default toHash
which corresponded to the default opEquals by having it simply use the toHash
methods of the member variables to generate an overall hash. So, we could
definitely make it so that toHash is generated for structs as long as opEquals
hasn't been defined.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 22 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771


Jacob Carlborg <doob me.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doob me.com


--- Comment #8 from Jacob Carlborg <doob me.com> 2013-03-23 07:34:51 PDT ---
(In reply to comment #4)
 I wouldn't go as far as using introspection to make a default toHash, but I
 think there's value in providing a default toHash for classes that just hashes
 the object's address (I think that's what the current toHash does).

I think that will conflict with having a moving garbage collector. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #9 from hsteoh quickfur.ath.cx 2013-03-23 08:20:16 PDT ---
(In reply to comment #8)
 I think that will conflict with having a moving garbage collector.

Good point, maybe the correct approach is to hash the respective hashes of each member of the class? Jonathan: I agree, the default toHash should not be generated if the class/struct defines opEquals. It's the user's responsibility to keep the two consistent. So basically, the default toHash should correspond to whatever == corresponds with by default. If == is bitwise comparison, then toHash also should be bitwise hash, etc.. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #10 from Jacob Carlborg <doob me.com> 2013-03-23 08:33:35 PDT ---
(In reply to comment #9)
 Good point, maybe the correct approach is to hash the respective hashes of each
 member of the class?

Jonathan didn't seem to like that idea. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #11 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-23 12:09:46
PDT ---
 Jonathan didn't seem to like that idea.

Given that the default opEquals is supposed to compare each member variable using ==, I think that it would make good sense for the default toHash to call toHash on each member variable and generate a new hash from those. What doesn't make sense is blindly using reflection to go over all of member variables and generating hashes for them regardless of how they're implemented. Some folks in the newsgroup liked the idea of using external functions to use reflection to go through a type's member variables recursively to generate a hash, and given how opEquals and toHash are related, I think that that's an extremely bad idea, and that's the main thing that I'm objecting to. But if we're talking about a default-generated toHash that's only generated when opEquals is default-generated, then we know some things about what opEquals looks like and can generate a toHash which correctly corresponds to it, which would generally mean calling toHash on each member variable and generating a new hash from that - though if the compiler decided that it could do a bitwise comparison of the struct (e.g. because all of the member variables were integral types), then generating a hash in a similar manner would make sense. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #12 from Jacob Carlborg <doob me.com> 2013-03-23 15:29:36 PDT ---
(In reply to comment #11)

 Given that the default opEquals is supposed to compare each member variable
 using ==, I think that it would make good sense for the default toHash to call
 toHash on each member variable and generate a new hash from those. What doesn't
 make sense is blindly using reflection to go over all of member variables and
 generating hashes for them regardless of how they're implemented. Some folks in
 the newsgroup liked the idea of using external functions to use reflection to
 go through a type's member variables recursively to generate a hash, and given
 how opEquals and toHash are related, I think that that's an extremely bad idea,
 and that's the main thing that I'm objecting to. But if we're talking about a
 default-generated toHash that's only generated when opEquals is
 default-generated, then we know some things about what opEquals looks like and
 can generate a toHash which correctly corresponds to it, which would generally
 mean calling toHash on each member variable and generating a new hash from that
 - though if the compiler decided that it could do a bitwise comparison of the
 struct (e.g. because all of the member variables were integral types), then
 generating a hash in a similar manner would make sense.

I see, so only when opEquals is default generated. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #13 from Jonathan M Davis <jmdavisProg gmx.com> 2013-03-23 19:12:53
PDT ---
 I see, so only when opEquals is default generated.

Yes. We could provide helper functions or mixins to make it easier to write hash functions if you define opEquals (or on classes which won't have any kind of default opEquals - let alone the useless one that they have now), but as soon as opEquals has been defined by the programmer, it has to be up to the programmer to define toHash correctly, as in order for hashing to work correctly, every object which is equal must have the same hash. And if the programmer defined opEquals, then the compiler will no longer have enough information to define a toHash which guarantees that all equal objects will have the same hash. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 23 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771


Martin Nowak <code dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code dawg.eu


--- Comment #14 from Martin Nowak <code dawg.eu> 2013-06-03 08:15:21 PDT ---
(In reply to comment #2)
 This probably can't be implemented until we replace AA's with a library
 template, since otherwise it would make it impossible for e.g. classes to be
 used as AA keys.

That very simple to solve, let the compiler pass two functions (hash, equals) to the AA implementation instead of RTTI.
 Once we have the library type, though, we can probably make use of UFCS to
 provide default implementations of toHash for all types.

I would not want to bet that this will happen. The current situation where the AA is partly implemented in the compiler and in the runtime is unfortunate to say the least. Also performance-wise a generic AA implementation will never be on par with one that's suited for a special purpose. So it makes a lot of sense to have a lean builtin implementation and provide more configurable ones in std.container. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jun 03 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9771



--- Comment #15 from Jonathan M Davis <jmdavisProg gmx.com> 2013-10-13 16:17:28
PDT ---
Relevant discussion on how to transition away from having these functions on
Object (the thread got broken up a bit - probably by the mailman bug):

http://forum.dlang.org/post/mailman.214.1369617617.13711.digitalmars-d puremagic.com
http://forum.dlang.org/post/mailman.280.1369712394.13711.digitalmars-d puremagic.com

The key suggestion is to change the compiler so that those 4 functions can be
free functions in object_.d, similar to how the compiler already does special
stuff for == rather than simply calling Object's opEquals.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 13 2013