digitalmars.D - Structs as Keys for AAs


Q. Schroll <qs.il.paperinik gmail.com> writes:

In [1] it says at 5. that

 For this reason, and for legacy reasons, an associative array 
 key is not allowed to define a specialized opCmp, but omit a 
 specialized opEquals. This restriction may be removed in future 
 versions of D.

I'm not completely sure what that means. Does "specialized" mean 
"user-defined"? I just challenged the spec and found an error by 
the way: [2]. Apart from that, it compiles.

For 5. I used

struct Key
{
     int id;
     string tag;

     int opCmp(const Key other) const
     {
         return this.id < other.id ? -1 : this.id == other.id ? 0 : 1;
     }

     bool opEquals(ref const Key other) const @safe pure nothrow
     {
         return this.id == other.id;
     }

     size_t toHash() const @safe pure nothrow
     {
         return id;
     }
}

as a key type. To me the part "is not allowed to define a 
specialized opCmp" is clearly wrong, either a compiler bug or an 
error in the spec.

Concerning opEquals and opCmp in general: Why isn't opEquals 
lowered to opCmp returning 0 if not present?

[1] https://dlang.org/spec/hash-map.html#using_struct_as_key
[2] https://github.com/dlang/dlang.org/pull/1861

Aug 09 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Wed, Aug 09, 2017 at 11:31:44PM +0000, Q. Schroll via Digitalmars-d wrote:
 In [1] it says at 5. that
 
 For this reason, and for legacy reasons, an associative array key is
 not allowed to define a specialized opCmp, but omit a specialized
 opEquals.  This restriction may be removed in future versions of D.

 
 I'm not completely sure what that means. Does "specialized" mean
 "user-defined"? I just challenged the spec and found an error by the
 way: [2]. Apart from that, it compiles.

The main issue is that the current AA implementation uses TypeInfo's
.equals method for comparing keys. (This dates from before D had
templates, and also because at the time, the AA implementation was too
tightly bound to the compiler and couldn't be easily changed.) The way
TypeInfo works, .equals will not do the right thing if you only define
opCmp without also defining opEquals.

This is why the spec was updated to say that you must essentially always
define opEquals if you define opCmp. Otherwise, you may get strange or
wrong behaviour when using the type as an AA key.


 For 5. I used
 
 struct Key
 {
     int id;
     string tag;
 
     int opCmp(const Key other) const
     {
         return this.id < other.id ? -1 : this.id == other.id ?  0 : 1;
     }
 
     bool opEquals(ref const Key other) const @safe pure nothrow
     {
         return this.id == other.id;
     }
 
     size_t toHash() const @safe pure nothrow
     {
         return id;
     }
 }
 
 as a key type. To me the part "is not allowed to define a specialized
 opCmp" is clearly wrong, either a compiler bug or an error in the
 spec.

Did you test the result at runtime with a real AA?  A telling sign would
be if you added the same Key twice, then iterate over the AA to print
out the entries.  You may get the same key more than once, which would
be indicative of this problem.
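
A minimal sketch of such a runtime check, reusing the Key struct quoted above. The value type, the tags, and the assertions are illustrative assumptions, not something from the original post. Two keys with the same id are inserted and the AA is then iterated. Because this Key defines both opEquals and toHash on id alone, the two insertions should land in a single entry.

```d
import std.stdio : writeln;

struct Key
{
    int id;
    string tag;

    int opCmp(const Key other) const
    {
        return this.id < other.id ? -1 : this.id == other.id ? 0 : 1;
    }

    // Equality and hashing both look at id only, so they agree with each other.
    bool opEquals(ref const Key other) const @safe pure nothrow
    {
        return this.id == other.id;
    }

    size_t toHash() const @safe pure nothrow
    {
        return id;
    }
}

void main()
{
    int[Key] aa;
    aa[Key(1, "first")] = 10;
    aa[Key(1, "second")] = 20; // same id: treated as the same key
    aa[Key(2, "third")] = 30;

    // Iteration order of an AA is unspecified.
    foreach (key, value; aa)
        writeln(key.id, " ", key.tag, " => ", value);

    // A duplicated key would show up here as a length of 3.
    assert(aa.length == 2);
    assert(aa[Key(1, "anything")] == 20);
}
```

If the key with id 1 showed up twice in the printed entries, that would be the symptom described above.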

Just because the compiler accepts the code doesn't necessarily mean
it's right. (Of course, it's arguably a bug that the compiler accepts it
in the first place. But the AA implementation in the compiler is a bit
fragile to handle at the moment, and I'm not sure if it can be easily
fixed without causing other problems.  We'll have to wait for Martin's
library AA to get merged...)


 Concerning opEquals and opCmp in general: Why isn't opEquals lowered to
 opCmp returning 0 if not present?

[...]

Mainly (1) for efficiency, and also, (2) to paraphrase Andrei, to allow
partial orders that may not be linear.

(1) because some types may have expensive computations to determine
whether something is bigger or smaller, but trivial to determine
equality.

(2) because conceivably you can implement the subset relation in opCmp
with a type that represents a set, so opCmp()==0 could mean the sets are
equal, OR it could mean that neither set is a subset of the other (they
are either disjoint, or each has elements the other lacks). Then
you'd need opEquals() to tell you whether or not they are equal.  For a
less exotic example, consider a custom floating-point type where NAN <
NAN and NAN > NAN are both false, so opCmp()==0 is the only reasonable
return value, yet opEquals() must also return false because NAN != NAN.
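
To make point (2) concrete, here is a small sketch of a set type ordered by the subset relation. The name BitSet and its bits field are hypothetical, chosen only for illustration. opCmp returns 0 both for equal sets and for sets where neither contains the other, so only opEquals can tell those two cases apart.

```d
struct BitSet
{
    uint bits; // each set bit stands for one element of the set

    bool opEquals(const BitSet other) const @safe pure nothrow
    {
        return bits == other.bits;
    }

    // Subset ordering: negative if this is a proper subset of other,
    // positive if it is a proper superset, 0 otherwise.
    int opCmp(const BitSet other) const @safe pure nothrow
    {
        if (bits == other.bits)
            return 0;
        if ((bits & other.bits) == bits)
            return -1;
        if ((bits & other.bits) == other.bits)
            return 1;
        return 0; // incomparable: neither set contains the other
    }
}

void main()
{
    auto a = BitSet(0b0011);
    auto b = BitSet(0b0111);
    auto c = BitSet(0b1100);

    assert(a < b);                // a is a proper subset of b
    assert(b > a);
    assert(a.opCmp(c) == 0);      // incomparable sets compare as 0 ...
    assert(a != c);               // ... yet they are not equal
    assert(!(a < c) && !(a > c));
}
```

One caveat of this encoding is that `a <= c` evaluates to true for incomparable sets, since it is lowered to `a.opCmp(c) <= 0`.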


T

-- 
INTEL = Only half of "intelligence".

Aug 09 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 8/9/17 7:31 PM, Q. Schroll wrote:
 In [1] it says at 5. that
 
 For this reason, and for legacy reasons, an associative array key is 
 not allowed to define a specialized opCmp, but omit a specialized 
 opEquals. This restriction may be removed in future versions of D.

 
 I'm not completely sure what that means. Does "specialized" mean 
 "user-defined"?

Yes, that's what it means.

 I just challenged the spec and found an error by the 
 way: [2]. Apart from that, it compiles.
 
 For 5. I used
 
 struct Key
 {
      int id;
      string tag;
 
      int opCmp(const Key other) const
      {
          return this.id < other.id ? -1 : this.id == other.id ?  0 : 1;
      }
 
      bool opEquals(ref const Key other) const @safe pure nothrow
      {
          return this.id == other.id;
      }
 
      size_t toHash() const @safe pure nothrow
      {
          return id;
      }
 }
 
 as a key type. To me the part "is not allowed to define a specialized 
 opCmp" is clearly wrong, either a compiler bug or an error in the spec.

You need to read it with the other part "but omit a specialized 
opEquals". In other words, you must implement opEquals if you implement 
opCmp. The reason is simple, because opEquals defaults to a comparison 
of all fields, and most likely if you are defining opCmp, it won't match 
the default opEquals.

The AA uses toHash and opEquals, but does not use opCmp. Therefore, if
this restriction weren't in place, you might define only opCmp, thinking
the AA would use it.
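
The mismatch can be sketched without an AA at all. The struct name Rec is hypothetical. It defines opCmp on id only and leaves opEquals to the compiler, so == still compares every field.

```d
struct Rec
{
    int id;
    string tag;

    // Ordering looks at id only; no opEquals is defined.
    int opCmp(const Rec other) const
    {
        return id < other.id ? -1 : id == other.id ? 0 : 1;
    }
}

void main()
{
    auto a = Rec(1, "x");
    auto b = Rec(1, "y");

    assert(a.opCmp(b) == 0);      // "equal" as far as opCmp is concerned
    assert(a != b);               // default equality compares id and tag
    assert(!(a < b) && !(a > b));
}
```

A hash table that relies on equality would therefore treat a and b as distinct keys, even though opCmp says they compare as equal.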

Note that in your example, your opEquals is more efficient than opCmp == 
0. This is the main reason opEquals is defined differently than opCmp.

 Concerning opEquals and opCmp in general: Why isn't opEquals lowered to 
 opCmp returning 0 if not present?

It really should IMO, but that's not how it works. I'm almost positive 
there's an enhancement request on this somewhere.

-Steve

Aug 09 2017
