digitalmars.D - Object.opEquals, opCmp, toHash

reply Walter Bright <newshound2 digitalmars.com> writes:
These all need to be:

     const pure nothrow @safe

Unless this is done, the utility of const, pure, nothrow and @safe is rather 
crippled.

Any reason why they shouldn't be?

One reason is memoization, aka lazy initialization, aka logical const. I don't 
believe these are worth it. If you must do it inside those functions (and 
functions that override them), you're on your own to make it work right (use 
unsafe casts, etc.).
Feb 16 2012
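
To make the proposal concrete, here is a minimal sketch of the three operations carrying all four attributes. It uses a struct rather than a class so that it does not depend on how Object declares these functions in any particular druntime version; the type `Point` and the helper `sameAndOrdered` are hypothetical names invented for illustration.

```d
import std.stdio : writeln;

// Hypothetical value type, used only for illustration.
struct Point
{
    int x, y;

    bool opEquals(ref const Point rhs) const pure nothrow @safe
    {
        return x == rhs.x && y == rhs.y;
    }

    int opCmp(ref const Point rhs) const pure nothrow @safe
    {
        if (x != rhs.x) return x < rhs.x ? -1 : 1;
        if (y != rhs.y) return y < rhs.y ? -1 : 1;
        return 0;
    }

    size_t toHash() const pure nothrow @safe
    {
        // Deliberately simple hash; good enough for a sketch.
        return cast(size_t) x * 31 + cast(size_t) y;
    }
}

// This helper can itself be const-correct, pure, nothrow and @safe only
// because the operators it calls carry the same attributes.
bool sameAndOrdered(const Point a, const Point b, const Point c) pure nothrow @safe
{
    return a == b && b < c;
}

void main()
{
    writeln(sameAndOrdered(Point(1, 2), Point(1, 2), Point(1, 3)));
}
```

If any one of the three operators dropped an attribute, `sameAndOrdered` would have to drop it too, which is the "crippling" effect described above.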
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, February 16, 2012 00:35:20 Walter Bright wrote:
 These all need to be:
 
      const pure nothrow @safe
 
 Unless this is done, the utility of const, pure, nothrow and @safe is rather
 crippled.
 
 Any reason why they shouldn't be?
 
 One reason is memoization, aka lazy initialization, aka logical const. I
 don't believe these are worth it. If you must do it inside those functions
 (and functions that override them), you're on your own to make it work
 right (use unsafe casts, etc.).

I think that that's essentially the conclusion that we as a group have come to in discussions on it in the past. The exceptions are those folks who don't want any of those functions to be const, because that cripples caching, lazy initialization, etc. for classes. And so they argue against the whole idea. But we _have_ to make them const or, as you say, const is horribly crippled. It's one area where D's const is rather costly, but I don't see any way around it.

toString is another one, though making that pure right now just wouldn't work, because so little string stuff can be pure (primarily because format, to, and appender aren't pure). But that really needs to be fixed anyway.

- Jonathan M Davis
Feb 16 2012
prev sibling next sibling parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
I agree. What about shared?

Feb 16 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, February 16, 2012 20:30:19 Daniel Murphy wrote:
 I agree.  What about shared?

Wouldn't that require an overload? You certainly couldn't make any of them shared by default, because then they wouldn't work with non-shared objects. - Jonathan M Davis
Feb 16 2012
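
A small sketch of the overload point: a method qualified `shared` is only callable on shared objects, so supporting both kinds of object means writing two overloads. The class `Counter` is a hypothetical example, not anything from druntime.

```d
import core.atomic : atomicLoad;

// Hypothetical class, for illustration only.
class Counter
{
    private int value = 7;

    // Selected for ordinary, thread-local objects.
    int get() const { return value; }

    // Separate overload, selected only when the object is shared.
    // Without the non-shared overload above, local.get() below
    // would not compile.
    int get() shared const { return atomicLoad(value); }
}

void main()
{
    auto local = new Counter;
    auto global = new shared(Counter);
    assert(local.get() == 7);
    assert(global.get() == 7);
}
```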
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/16/2012 1:30 AM, Daniel Murphy wrote:
 What about shared?

No! Shared is a special animal, and the user will still have to take care to deal with synchronization issues.
Feb 16 2012
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
I mean, are we going to have a shared overload in Object for each of those functions?

Feb 16 2012
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 16/02/2012 12:12, Daniel Murphy wrote:
 I mean, are we going to have a shared overload in Object for each of those
 functions?

I'd argue that in some cases, the compiler should be able to generate a shared const function automatically from the const function. This has some limitations. I'm currently writing an article about that possibility and what its limitations are. BTW, shared is - sadly - almost a stub right now, so it doesn't really matter.


Feb 16 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/16/2012 3:12 AM, Daniel Murphy wrote:
 I mean, are we going to have a shared overload in Object for each of those
 functions?

No.
Feb 16 2012
prev sibling next sibling parent reply Don Clugston <dac nospam.com> writes:
On 16/02/12 09:35, Walter Bright wrote:
 These all need to be:

 const pure nothrow @safe

 Unless this is done, the utility of const, pure, nothrow and @safe is
 rather crippled.

 Any reason why they shouldn't be?

 One reason is memoization, aka lazy initialization, aka logical const. I
 don't believe these are worth it. If you must do it inside those
 functions (and functions that override them), you're on your own to make
 it work right (use unsafe casts, etc.).

And if memoization has problems with these functions being const, it will have problems elsewhere. They should be const, nothrow, @safe.

I'm less sure about pure, though. What if (for example) you have a struct which is just an index into a global table? Like a Windows handle, for example. That should still work.
Feb 16 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/16/2012 1:47 AM, Don Clugston wrote:
 I'm less sure about pure, though. What if (for example) you have a struct which
 is just an index into a global table? Like a Windows handle, for example. That
 should still work.

Without pure, associative arrays cannot work reliably. I'd suggest that if one is relying on a mutable global array, that one shouldn't use opEquals, opCmp, etc., and should use functions with other names.
Feb 16 2012
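
A rough sketch of the handle scenario and why an impure toHash is a hazard for associative arrays. The global `table` and the struct `Handle` are hypothetical, and the hash is deliberately simplistic. The key never changes, yet its hash does, so a container that filed the key under the old hash can no longer be relied on to find it.

```d
// Hypothetical global table; module-level variables in D are mutable.
string[] table = ["alpha", "beta"];

struct Handle
{
    size_t index;

    // Cannot be marked `pure`: it reads the mutable global `table`.
    size_t toHash() const nothrow @safe
    {
        return table[index].length; // deliberately simplistic hash
    }

    bool opEquals(ref const Handle rhs) const nothrow @safe
    {
        return table[index] == table[rhs.index];
    }
}

void main()
{
    auto h = Handle(0);
    const before = h.toHash();
    table[0] = "alphabet";    // the handle itself is unchanged...
    const after = h.toHash(); // ...but its hash is not
    assert(before != after);
}
```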
parent Don Clugston <dac nospam.com> writes:
On 16/02/12 11:16, Walter Bright wrote:
 On 2/16/2012 1:47 AM, Don Clugston wrote:
 I'm less sure about pure, though. What if (for example) you have a
 struct which
 is just an index into a global table? Like a Windows handle, for
 example. That
 should still work.

 Without pure, associative arrays cannot work reliably.

That's a good argument.
 I'd suggest that if one is relying on a mutable global array, that one
 shouldn't use opEquals, opCmp, etc., and should use functions with other
 names.

I would say that the solution would be a (library) implementation of logical pure.
Feb 16 2012
prev sibling next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 16/02/2012 08:35, Walter Bright wrote:
 These all need to be:

 const pure nothrow @safe

 Unless this is done, the utility of const, pure, nothrow and @safe is rather
 crippled.

 Any reason why they shouldn't be?

 One reason is memoization, aka lazy initialization, aka logical const.

But if the method is pure, the compiler can automatically implement this as an optimisation. Stewart.
Feb 16 2012
parent reply bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:

 But if the method is pure, the compiler can automatically implement this as an
 optimisation.

Functions like toHash take nothing and return a single size_t (hash_t). Often you want to compute the hash value lazily, but this is not possible if toHash needs to be pure. An explicit optional @memoize annotation (similar to std.functional.memoize) allows toHash to be both caching and safe.

(I was also thinking about the idea of a @trusted_pure, but I don't like it, and I think it causes chaos.)

Bye,
bearophile
Feb 16 2012
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 16/02/2012 13:05, bearophile wrote:
 Stewart Gordon:

 But if the method is pure, the compiler can automatically implement this as an
 optimisation.

 Functions like toHash take nothing and return a single size_t (hash_t). Often you want to compute the hash value lazily, but this is not possible if toHash needs to be pure.

Hence my point. The laziness could be implemented on the compiler side, thereby bypassing the contracts of purity and constancy. For example, if the source code is

     string toString() const pure {
         return ...;
     }

then the compiler would generate code equivalent to

     bool _has_cached_toString;
     string _cached_toString;

     string toString() {
         if (!_has_cached_toString) {
             _cached_toString = ...;
             _has_cached_toString = true;
         }
         return _cached_toString;
     }

and moreover, clear the _has_cached_toString flag whenever any of the members on which the cached value depends is changed.

Feb 16 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Stewart Gordon:

 For example, if the source code is
 
      string toString() const pure {
          return ...;
      }
 
 then the compiler would generate code equivalent to
 
      bool _has_cached_toString;
      string _cached_toString;
 
      string toString() {
          if (!_has_cached_toString) {
              _cached_toString = ...;
              _has_cached_toString = true;
          }
          return _cached_toString;
      }

The purpose of using an explicit @memoize is to offer the programmer the choice to enable or disable such caching. By default there is no caching. If toString() is required only rarely and the string is large but quick to compute, you probably don't want it to be cached, so you don't add @memoize. If toHash is needed often, and its computation requires some time, given that it only requires one word of memory, you probably want to add @memoize. The compiler is then free to implement @memoize with a dictionary when there are arguments and with a bool+field when the memoized method has no arguments.

Bye,
bearophile
Feb 16 2012
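
For comparison, this is roughly what the hand-written "bool+field" cache looks like today when toHash must stay const and pure, i.e. the "unsafe casts" route mentioned at the top of the thread. It is only a sketch: `Name` and `fnv1a` are hypothetical names, and writing through a pointer obtained by casting away const is something the language does not sanction, so the programmer alone is responsible for it (it must never be used on immutable instances).

```d
// Hypothetical helper: a simple FNV-1a style hash over the bytes of a string.
size_t fnv1a(string s) pure nothrow @safe
{
    size_t h = 2166136261u;
    foreach (char c; s)
        h = (h ^ c) * 16777619u;
    return h;
}

struct Name
{
    string text;
    private size_t cachedHash;
    private bool hasHash;

    // @trusted because of the cast below; the compiler cannot check it.
    size_t toHash() const pure nothrow @trusted
    {
        if (!hasHash)
        {
            // Assumption: the object is not really immutable. Casting away
            // const and writing is outside what the language guarantees.
            auto self = cast(Name*) &this;
            self.cachedHash = fnv1a(text);
            self.hasHash = true;
        }
        return cachedHash;
    }
}

void main()
{
    auto n = Name("hello");
    assert(n.toHash() == fnv1a("hello")); // computed on first use
    assert(n.toHash() == fnv1a("hello")); // served from the cached field
}
```

Note that nothing here clears the flag when `text` changes; the invalidation step described in the compiler-side proposal would also have to be written by hand.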
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/16/2012 09:35 AM, Walter Bright wrote:
 These all need to be:

 const pure nothrow @safe

 Unless this is done, the utility of const, pure, nothrow and @safe is
 rather crippled.

 Any reason why they shouldn't be?

The utility of const, pure, nothrow and @safe is already rather crippled when using Phobos. It is a rather small minority of my pure functions that are effectively annotated pure, because most of Phobos is not properly annotated and the inference for templates does not work satisfactorily yet.

So imo, making opEquals, opCmp, toHash const pure nothrow @safe is blocked by properly annotating Phobos as well as the following issues:

Bugs in existing attribute inference:
http://d.puremagic.com/issues/show_bug.cgi?id=7205
http://d.puremagic.com/issues/show_bug.cgi?id=7511

Enhancement required for making templates const correct:
http://d.puremagic.com/issues/show_bug.cgi?id=7521

 One reason is memoization, aka lazy initialization, aka logical const. I
 don't believe these are worth it. If you must do it inside those
 functions (and functions that override them), you're on your own to make
 it work right (use unsafe casts, etc.).

It would be helpful if we could add cast(pure) for that purpose. The documentation could state that cast(pure) is only valid if the behaviour of the enclosing function still appears to be pure.
Feb 16 2012
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 16/02/2012 16:52, Timon Gehr wrote:
 On 02/16/2012 09:35 AM, Walter Bright wrote:
 These all need to be:

 const pure nothrow @safe

 Unless this is done, the utility of const, pure, nothrow and @safe is
 rather crippled.

 Any reason why they shouldn't be?

 The utility of const, pure, nothrow and @safe is already rather crippled when using Phobos. It is a rather small minority of my pure functions that are effectively annotated pure, because most of Phobos is not properly annotated and the inference for templates does not work satisfactorily yet.

 So imo, making opEquals, opCmp, toHash const pure nothrow @safe is blocked by properly annotating Phobos as well as the following issues:

 Bugs in existing attribute inference:
 http://d.puremagic.com/issues/show_bug.cgi?id=7205
 http://d.puremagic.com/issues/show_bug.cgi?id=7511

 Enhancement required for making templates const correct:
 http://d.puremagic.com/issues/show_bug.cgi?id=7521

Good point. However, I don't think that should stop us. BTW, what should happen if we write an opCmp that isn't nothrow, @safe or pure?
Feb 16 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/16/2012 7:52 AM, Timon Gehr wrote:
 So imo, making opEquals, opCmp, toHash const pure nothrow @safe is blocked by
 properly annotating Phobos as well as the following issues:

 Bugs in existing attribute inference:
 http://d.puremagic.com/issues/show_bug.cgi?id=7205
 http://d.puremagic.com/issues/show_bug.cgi?id=7511

 Enhancement required for making templates const correct:
 http://d.puremagic.com/issues/show_bug.cgi?id=7521

Yes, I want to get all that fixed.
 One reason is memoization, aka lazy initialization, aka logical const. I
 don't believe these are worth it. If you must do it inside those
 functions (and functions that override them), you're on your own to make
 it work right (use unsafe casts, etc.).

 It would be helpful if we could add cast(pure) for that purpose. The documentation could state that cast(pure) is only valid if the behaviour of the enclosing function still appears to be pure.

cast(pure) sounds like a good idea.
Feb 16 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 16, 2012 at 12:35:20AM -0800, Walter Bright wrote:
 These all need to be:
 
     const pure nothrow @safe
 
 Unless this is done, the utility of const, pure, nothrow and @safe
 is rather crippled.
 
 Any reason why they shouldn't be?

Nope.
 One reason is memoization, aka lazy initialization, aka logical
 const. I don't believe these are worth it. If you must do it inside
 those functions (and functions that override them), you're on your
 own to make it work right (use unsafe casts, etc.).

This is a non-problem once the compiler implements memoization as an optimisation. Which it can't until we go ahead with this change. This is the direction that we *should* be going anyway, so why not do it now rather than later?

T

--
Latin's a dead language, as dead as can be; it killed off all the Romans, and now it's killing me! -- Schoolboy
Feb 16 2012
prev sibling next sibling parent "Paul D. Anderson" <paul.d.removethis.anderson comcast.andthis.net> writes:
On Thursday, 16 February 2012 at 08:35:20 UTC, Walter Bright 
wrote:
 These all need to be:

    const pure nothrow @safe

 Unless this is done, the utility of const, pure, nothrow and 
 @safe is rather crippled.

 Any reason why they shouldn't be?

 One reason is memoization, aka lazy initialization, aka logical 
 const. I don't believe these are worth it. If you must do it 
 inside those functions (and functions that override them), 
 you're on your own to make it work right (use unsafe casts, 
 etc.).

+1
Feb 16 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, February 16, 2012 09:38:54 H. S. Teoh wrote:
 This is a non-problem once the compiler implements memoization as an
 optimisation. Which it can't until we go ahead with this change. This is
 the direction that we *should* be going anyway, so why not do it now
 rather than later?

I would point out that there are no plans to implement any kind of memoization in the language or compiler. Also, while it can help performance, it can also _harm_ performance. So having it controlled by the compiler is not necessarily a great idea anyway. It's really the sort of thing that should involve profiling on the part of the programmer. If you want memoization, use std.functional.memoize.

- Jonathan M Davis
Feb 16 2012
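
As a brief illustration of the library route, std.functional.memoize wraps a free function and caches results per argument list, leaving the choice of what to memoize to the programmer. The function `slowSquare` and the counter `calls` are hypothetical stand-ins for an expensive computation.

```d
import std.functional : memoize;

int calls; // counts how often the real computation runs

// Hypothetical stand-in for an expensive function.
ulong slowSquare(ulong n)
{
    ++calls;
    return n * n;
}

// Opt-in caching: only callers of fastSquare pay for the cache.
alias fastSquare = memoize!slowSquare;

void main()
{
    assert(fastSquare(12) == 144);
    assert(fastSquare(12) == 144); // second call is answered from the cache
    assert(calls == 1);
}
```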
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 16, 2012 at 01:53:46PM -0500, Jonathan M Davis wrote:
 On Thursday, February 16, 2012 09:38:54 H. S. Teoh wrote:
 This is a non-problem once the compiler implements memoization as an
 optimisation. Which it can't until we go ahead with this change.
 This is the direction that we *should* be going anyway, so why not
 do it now rather than later?

 I would point out that there are no plans to implement any kind of memoization in the language or compiler. Also, while it can help performance, it can also _harm_ performance. So having it controlled by the compiler is not necessarily a great idea anyway. It's really the sort of thing that should involve profiling on the part of the programmer.

Then I agree with bearophile that we should have @memoize (or its negation), so that the programmer can indicate to the compiler that the function should be memoized (or not).

T

--
It won't be covered in the book. The source code has to be useful for something, after all. -- Larry Wall
Feb 16 2012
parent Don <nospam nospam.com> writes:
On 16.02.2012 20:10, H. S. Teoh wrote:
 On Thu, Feb 16, 2012 at 01:53:46PM -0500, Jonathan M Davis wrote:
 On Thursday, February 16, 2012 09:38:54 H. S. Teoh wrote:
 This is a non-problem once the compiler implements memoization as an
 optimisation. Which it can't until we go ahead with this change.
 This is the direction that we *should* be going anyway, so why not
 do it now rather than later?

 I would point out that there are no plans to implement any kind of memoization in the language or compiler. Also, while it can help performance, it can also _harm_ performance. So having it controlled by the compiler is not necessarily a great idea anyway. It's really the sort of thing that should involve profiling on the part of the programmer.

 Then I agree with bearophile that we should have @memoize (or its negation), so that the programmer can indicate to the compiler that the function should be memoized (or not).

Unfortunately, that's too simple. Although the compiler can memoize a few simple cases, it can't do it efficiently in general. Sometimes it makes sense to store the memoized result in the object, sometimes separately. Sometimes only certain cases should be memoized -- the question of whether to memoize or not may depend on the value of the object (you only want to memoize the complicated cases). It's too hard for the poor compiler.
Feb 16 2012
prev sibling parent "foobar" <foo bar.com> writes:
On Thursday, 16 February 2012 at 21:05:19 UTC, Don wrote:
 On 16.02.2012 20:10, H. S. Teoh wrote:
 On Thu, Feb 16, 2012 at 01:53:46PM -0500, Jonathan M Davis 
 wrote:
 On Thursday, February 16, 2012 09:38:54 H. S. Teoh wrote:
 This is a non-problem once the compiler implements 
 memoization as an
 optimisation. Which it can't until we go ahead with this 
 change.
 This is the direction that we *should* be going anyway, so 
 why not
 do it now rather than later?

 I would point out that there are no plans to implement any kind of memoization in the language or compiler. Also, while it can help performance, it can also _harm_ performance. So having it controlled by the compiler is not necessarily a great idea anyway. It's really the sort of thing that should involve profiling on the part of the programmer.

 Then I agree with bearophile that we should have @memoize (or its negation), so that the programmer can indicate to the compiler that the function should be memoized (or not).

 Unfortunately, that's too simple. Although the compiler can memoize a few simple cases, it can't do it efficiently in general. Sometimes it makes sense to store the memoized result in the object, sometimes separately. Sometimes only certain cases should be memoized -- the question of whether to memoize or not may depend on the value of the object (you only want to memoize the complicated cases). It's too hard for the poor compiler.

I'll add a question to Don's comment: compile-time memoization means the decision is static. Shouldn't the decision instead be made at run time, to account for the run-time characteristics of the process? E.g. memoize the frequent case. This sounds to me more in the realm of a JVM optimization than a compiler one.
Feb 16 2012