www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How to implement opCmp?

reply Honey <honey pot.com> writes:
Hi everyone!

Given

struct Pair(T, U = T)
{
   T f;
   U s;
}

what is the intended way to genrically implement opCmp for this 
struct?

The naive approach

struct Pair(T, U = T)
{
   // [...]

   int opCmp(const Pair r) const
   {
     immutable c = f.opCmp(r.f);
     return c != 0 ? c : s.opCmp(r.s);
   }
}

yields

Error: no property 'opCmp' for type 'const(int)'

if one tries to compare pairs of int.
Jun 09
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/9/17 10:53 AM, Honey wrote:
 Hi everyone!

 Given

 struct Pair(T, U = T)
 {
   T f;
   U s;
 }

 what is the intended way to genrically implement opCmp for this struct?

 The naive approach

 struct Pair(T, U = T)
 {
   // [...]

   int opCmp(const Pair r) const
   {
     immutable c = f.opCmp(r.f);
     return c != 0 ? c : s.opCmp(r.s);
   }
 }

 yields

 Error: no property 'opCmp' for type 'const(int)'

 if one tries to compare pairs of int.
TypeInfo can provide the comparison for you, but it's a little ugly. If we take a look at std.algorithm.comparison.cmp, it uses operators to compare two values (i.e. < first, then >, otherwise return 0). However, this is less performant if the type defines opCmp (you are calling it twice). There really ought to be an opCmp for any type, which does the best thing available. I'm not sure if it exists. If I were to write it, it would be something like: int doCmp(T)(auto ref T t1, auto ref T t2) { static if(is(typeof(t1.opCmp(t2)))) return t1.opCmp(t2); else { if(t1 < t2) return -1; else if(t1 > t2) return 1; return 0; } } Then your function would work, just use doCmp instead of opCmp. Note that D already (I think) does by default a member-wise comparison, in order of declaration. So if that's all you really need, I don't think you need to define opCmp at all. -Steve
Jun 09
next sibling parent reply Honey <honey pot.com> writes:
Hi Steve!

On Friday, 9 June 2017 at 15:12:42 UTC, Steven Schveighoffer 
wrote:
 TypeInfo can provide the comparison for you, but it's a little 
 ugly.

 If we take a look at std.algorithm.comparison.cmp, it uses 
 operators to compare two values (i.e. < first, then >, 
 otherwise return 0). However, this is less performant if the 
 type defines opCmp (you are calling it twice).
Calling opCmp twice on the same data is exactly what I tried to avoid.
 There really ought to be an opCmp for any type, which does the 
 best thing available. I'm not sure if it exists.
I agree it should exist.
 If I were to write it, it would be something like:

 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
    static if(is(typeof(t1.opCmp(t2))))
 	return t1.opCmp(t2);
    else
    {
       if(t1 < t2) return -1;
       else if(t1 > t2) return 1;
       return 0;
    }
 }

 Then your function would work, just use doCmp instead of opCmp.
Thanks. That's working. Do you know whether this will always generate optimally efficient code (say, T is int and there is hardware support for three way comparison)?
 Note that D already (I think) does by default a member-wise 
 comparison, in order of declaration. So if that's all you 
 really need, I don't think you need to define opCmp at all.
Checked that: Error: need member function opCmp() for struct Pair!(int, int) to compare
Jun 09
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/9/17 11:33 AM, Honey wrote:
 Hi Steve!

 On Friday, 9 June 2017 at 15:12:42 UTC, Steven Schveighoffer wrote:
 If I were to write it, it would be something like:

 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
    static if(is(typeof(t1.opCmp(t2))))
     return t1.opCmp(t2);
    else
    {
       if(t1 < t2) return -1;
       else if(t1 > t2) return 1;
       return 0;
    }
 }

 Then your function would work, just use doCmp instead of opCmp.
Thanks. That's working. Do you know whether this will always generate optimally efficient code (say, T is int and there is hardware support for three way comparison)?
No, I don't know. I wrote it in about 10 seconds on the forum :) I'm sure there's better ways to compare integers than that. This is why I think such a function should exist in Phobos/druntime. I'm not 100% sure it doesn't already, but I couldn't find one...
 Note that D already (I think) does by default a member-wise
 comparison, in order of declaration. So if that's all you really need,
 I don't think you need to define opCmp at all.
Checked that: Error: need member function opCmp() for struct Pair!(int, int) to compare
OK, maybe it's just equality and hashing then. Sorry for the noise. -Steve
Jun 09
parent reply Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 17:28:18 UTC, Steven Schveighoffer 
wrote:
 This is why I think such a function should exist in 
 Phobos/druntime. I'm not 100% sure it doesn't already, but I 
 couldn't find one...
Looking at the implementation of Tuple.opCmp, I'm not sure I'd bet on existence of opCmp for fundamental types: int opCmp(R)(R rhs) if (areCompatibleTuples!(typeof(this), R, "<")) { foreach (i, Unused; Types) { if (field[i] != rhs.field[i]) { return field[i] < rhs.field[i] ? -1 : 1; } } return 0; }
Jun 09
parent reply Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 17:50:28 UTC, Honey wrote:
 Looking at the implementation of Tuple.opCmp, I'm not sure I'd 
 bet on existence of opCmp for fundamental types:

         int opCmp(R)(R rhs)
         if (areCompatibleTuples!(typeof(this), R, "<"))
         {
             foreach (i, Unused; Types)
             {
                 if (field[i] != rhs.field[i])
                 {
                     return field[i] < rhs.field[i] ? -1 : 1;
                 }
             }
             return 0;
         }
It turned out that there is a standard three way comparison function for ranges and strings [1]. Doesn't it make sense to introduce another overload of cmp similar to Steve's doCmp [2] right at that spot? This would simplify the implementation of opCmp for aggregates and can also lead to a moderate performance benefit [3]. (Note that [3] does not provide an additional overload of cmp but uses a less elegant approach with the same performance characteristics.) // $ ldc2 -O3 -release -boundscheck=off cmpTuple.d // // real 0m18.787s // user 0m18.784s // sys 0m0.003s // // // $ ldc2 -O3 -release -boundscheck=off cmpTuple.d -d-version=singlePassCmp // $ time ./cmpTuple // // real 0m16.109s // user 0m16.102s // sys 0m0.007s [1] https://dlang.org/phobos/std_algorithm_comparison.html#cmp [2] http://forum.dlang.org/post/ohedtb$1phe$1 digitalmars.com [3] https://dpaste.dzfl.pl/7cbfa2e1965b
Jun 11
next sibling parent reply Honey <honey pot.com> writes:
On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
 Doesn't it make sense to introduce another overload of cmp 
 similar to Steve's doCmp [2] right at that spot?
Moreover, it seems that std.algorithm.cmp should employ three way comparison as well. The current implementation int cmp(alias pred = "a < b", R1, R2)(R1 r1, R2 r2) if (isInputRange!R1 && isInputRange!R2 && !(isSomeString!R1 && isSomeString!R2)) { for (;; r1.popFront(), r2.popFront()) { if (r1.empty) return -cast(int)!r2.empty; if (r2.empty) return !r1.empty; auto a = r1.front, b = r2.front; if (binaryFun!pred(a, b)) return -1; if (binaryFun!pred(b, a)) return 1; } } does not seem to be great.
Jun 11
parent Honey <honey pot.com> writes:
On Sunday, 11 June 2017 at 15:40:42 UTC, Honey wrote:
 On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
 Doesn't it make sense to introduce another overload of cmp 
 similar to Steve's doCmp [2] right at that spot?
Moreover, it seems that std.algorithm.cmp should employ three way comparison as well.
Forget about it. I think a different approach is required, here.
Jun 11
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/11/17 11:24 AM, Honey wrote:
 On Friday, 9 June 2017 at 17:50:28 UTC, Honey wrote:
 Looking at the implementation of Tuple.opCmp, I'm not sure I'd bet on
 existence of opCmp for fundamental types:

         int opCmp(R)(R rhs)
         if (areCompatibleTuples!(typeof(this), R, "<"))
         {
             foreach (i, Unused; Types)
             {
                 if (field[i] != rhs.field[i])
                 {
                     return field[i] < rhs.field[i] ? -1 : 1;
                 }
             }
             return 0;
         }
It turned out that there is a standard three way comparison function for ranges and strings [1].
Yes, I saw that when I was looking (you can see from my reply that you quoted below).
 Doesn't it make sense to introduce another overload of cmp similar to
 Steve's doCmp [2] right at that spot?
Yes I think it makes sense to have such a comparison function for non-ranges. Yes it probably belongs there, as there are other functions (e.g. swap) that are not specific to ranges in std.algorithm. It should probably not be called cmp, as it will be a default comparison (with the default ordering), although if we did something like: int cmp(alias pred = "a < b", T)(T t1, T t2) if(!isInputRange!T) { static if(pred == "a < b") { /* do default optimized way */ } else { /* more generic mechanism using pred */ } } it might be nice. Need to think about the API ramifications.
 This would simplify the implementation of opCmp for aggregates and can
 also lead to a moderate performance benefit [3]. (Note that [3] does not
 provide an additional overload of cmp but uses a less elegant approach
 with the same performance characteristics.)
I agree. It's a thing also that can be optimized in unintuitive ways. I think Andrei has a nice way to do opCmp for integers that's a simple subtraction and negation or something like that. -Steve
Jun 13
next sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Tue, Jun 13, 2017 at 10:51:40AM -0400, Steven Schveighoffer via
Digitalmars-d-learn wrote:
[...]
 I think Andrei has a nice way to do opCmp for integers that's a simple
 subtraction and negation or something like that.
[...] In theory, cmp(int x, int y) can be implemented simply as (x - y). However, this fails when integer overflow occurs. Does Andrei have a nice way of doing this that isn't vulnerable to integer overflow? T -- You are only young once, but you can stay immature indefinitely. -- azephrahel
Jun 13
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 13 June 2017 at 16:49:14 UTC, H. S. Teoh wrote:
 On Tue, Jun 13, 2017 at 10:51:40AM -0400, Steven Schveighoffer 
 via Digitalmars-d-learn wrote: [...]
 I think Andrei has a nice way to do opCmp for integers that's 
 a simple subtraction and negation or something like that.
[...] In theory, cmp(int x, int y) can be implemented simply as (x - y). However, this fails when integer overflow occurs. Does Andrei have a nice way of doing this that isn't vulnerable to integer overflow?
return (x > y) - (x < y); According to this stackoverflow question it looks like a good candidate https://stackoverflow.com/questions/10996418/efficient-integer-compare-function
Jun 13
prev sibling parent Honey <honey pot.com> writes:
On Tuesday, 13 June 2017 at 14:51:40 UTC, Steven Schveighoffer 
wrote:
 Yes, I saw that when I was looking (you can see from my reply 
 that you quoted below).
Yes, I had missed that point.
 Yes I think it makes sense to have such a comparison function 
 for non-ranges. Yes it probably belongs there, as there are 
 other functions (e.g. swap) that are not specific to ranges in 
 std.algorithm. It should probably not be called cmp, as it will 
 be a default comparison (with the default ordering), although 
 if we did something like:

 int cmp(alias pred = "a < b", T)(T t1, T t2) if(!isInputRange!T)
 {
    static if(pred == "a < b") { /* do default optimized way */ }
    else { /* more generic mechanism using pred */ }
 }

 it might be nice. Need to think about the API ramifications.
I retracted my earlier proposal after I had realized my confusion. I had thought that cmp would implement three way range comparison based on three way element comparison. Then I realized that it is based on "a < b" or alike. The latter is certainly useful but I am afraid that this approach does not always lead to optimal performance. I gathered a few ideas about the subject. I have to sit down and write it down.
 I agree. It's a thing also that can be optimized in unintuitive 
 ways. I think Andrei has a nice way to do opCmp for integers 
 that's a simple subtraction and negation or something like that.
I observed that compiler optimizers are pretty smart about comparison of individual integers. I guess that we do not need to be clever, here.
Jun 13
prev sibling parent reply drug <drug2004 bk.ru> writes:
09.06.2017 18:12, Steven Schveighoffer пишет:
 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
     static if(is(typeof(t1.opCmp(t2))))
      return t1.opCmp(t2);
     else
     {
        if(t1 < t2) return -1;
        else if(t1 > t2) return 1;
        return 0;
     }
 }
Isn't it enough to use just '<' et al? int opCmp(const Pair r) const { if (f < r.f) return -1; if (f > r.f) return 1; if (s < r.s) return -1; if (s > r.s) return 1; return 0; } for floating types it's not completely right of course but nevertheless
Jun 09
parent drug <drug2004 bk.ru> writes:
I re-read thoroughly and got it)
Jun 09