digitalmars.D.learn - How to implement opCmp?

Honey (22/22) Jun 09 2017 Hi everyone!

Steven Schveighoffer (25/46) Jun 09 2017 TypeInfo can provide the comparison for you, but it's a little ugly.

Honey (13/37) Jun 09 2017 Hi Steve!

Steven Schveighoffer (7/32) Jun 09 2017 No, I don't know. I wrote it in about 10 seconds on the forum :) I'm

Honey (16/19) Jun 09 2017 Looking at the implementation of Tuple.opCmp, I'm not sure I'd

Honey (27/41) Jun 11 2017 It turned out that there is a standard three way comparison

Honey (17/19) Jun 11 2017 Moreover, it seems that std.algorithm.cmp should employ three way

Honey (2/7) Jun 11 2017 Forget about it. I think a different approach is required, here.

Steven Schveighoffer (18/42) Jun 13 2017 Yes, I saw that when I was looking (you can see from my reply that you

H. S. Teoh via Digitalmars-d-learn (9/11) Jun 13 2017 [...]

Patrick Schluter (5/16) Jun 13 2017 return (x > y) - (x < y);

Honey (14/31) Jun 13 2017 Yes, I had missed that point.

drug (15/26) Jun 09 2017 Isn't it enough to use just '<' et al?

drug (1/1) Jun 09 2017 I re-read thoroughly and got it)

Honey <honey pot.com> writes:

Hi everyone!

Given

struct Pair(T, U = T)
{
   T f;
   U s;
}

what is the intended way to genrically implement opCmp for this 
struct?

The naive approach

struct Pair(T, U = T)
{
   // [...]

   int opCmp(const Pair r) const
   {
     immutable c = f.opCmp(r.f);
     return c != 0 ? c : s.opCmp(r.s);
   }
}

yields

Error: no property 'opCmp' for type 'const(int)'

if one tries to compare pairs of int.

Jun 09 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/9/17 10:53 AM, Honey wrote:
 Hi everyone!

 Given

 struct Pair(T, U = T)
 {
   T f;
   U s;
 }

 what is the intended way to genrically implement opCmp for this struct?

 The naive approach

 struct Pair(T, U = T)
 {
   // [...]

   int opCmp(const Pair r) const
   {
     immutable c = f.opCmp(r.f);
     return c != 0 ? c : s.opCmp(r.s);
   }
 }

 yields

 Error: no property 'opCmp' for type 'const(int)'

 if one tries to compare pairs of int.

TypeInfo can provide the comparison for you, but it's a little ugly.

If we take a look at std.algorithm.comparison.cmp, it uses operators to 
compare two values (i.e. < first, then >, otherwise return 0). However, 
this is less performant if the type defines opCmp (you are calling it 
twice).

There really ought to be an opCmp for any type, which does the best 
thing available. I'm not sure if it exists.

If I were to write it, it would be something like:

int doCmp(T)(auto ref T t1, auto ref T t2)
{
    static if(is(typeof(t1.opCmp(t2))))
	return t1.opCmp(t2);
    else
    {
       if(t1 < t2) return -1;
       else if(t1 > t2) return 1;
       return 0;
    }
}

Then your function would work, just use doCmp instead of opCmp.

Note that D already (I think) does by default a member-wise comparison, 
in order of declaration. So if that's all you really need, I don't think 
you need to define opCmp at all.

-Steve

Jun 09 2017

Honey <honey pot.com> writes:

Hi Steve!

On Friday, 9 June 2017 at 15:12:42 UTC, Steven Schveighoffer 
wrote:
 TypeInfo can provide the comparison for you, but it's a little 
 ugly.

 If we take a look at std.algorithm.comparison.cmp, it uses 
 operators to compare two values (i.e. < first, then >, 
 otherwise return 0). However, this is less performant if the 
 type defines opCmp (you are calling it twice).

Calling opCmp twice on the same data is exactly what I tried to 
avoid.


 There really ought to be an opCmp for any type, which does the 
 best thing available. I'm not sure if it exists.

I agree it should exist.


 If I were to write it, it would be something like:

 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
    static if(is(typeof(t1.opCmp(t2))))
 	return t1.opCmp(t2);
    else
    {
       if(t1 < t2) return -1;
       else if(t1 > t2) return 1;
       return 0;
    }
 }

 Then your function would work, just use doCmp instead of opCmp.

Thanks. That's working.

Do you know whether this will always generate optimally efficient 
code (say, T is int and there is hardware support for three way 
comparison)?


 Note that D already (I think) does by default a member-wise 
 comparison, in order of declaration. So if that's all you 
 really need, I don't think you need to define opCmp at all.

Checked that:

Error: need member function opCmp() for struct Pair!(int, int) to 
compare

Jun 09 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/9/17 11:33 AM, Honey wrote:
 Hi Steve!

 On Friday, 9 June 2017 at 15:12:42 UTC, Steven Schveighoffer wrote:
 If I were to write it, it would be something like:

 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
    static if(is(typeof(t1.opCmp(t2))))
     return t1.opCmp(t2);
    else
    {
       if(t1 < t2) return -1;
       else if(t1 > t2) return 1;
       return 0;
    }
 }

 Then your function would work, just use doCmp instead of opCmp.

 Thanks. That's working.

 Do you know whether this will always generate optimally efficient code
 (say, T is int and there is hardware support for three way comparison)?

No, I don't know. I wrote it in about 10 seconds on the forum :) I'm 
sure there's better ways to compare integers than that.

This is why I think such a function should exist in Phobos/druntime. I'm 
not 100% sure it doesn't already, but I couldn't find one...

 Note that D already (I think) does by default a member-wise
 comparison, in order of declaration. So if that's all you really need,
 I don't think you need to define opCmp at all.

 Checked that:

 Error: need member function opCmp() for struct Pair!(int, int) to compare

OK, maybe it's just equality and hashing then. Sorry for the noise.

-Steve

Jun 09 2017

Honey <honey pot.com> writes:

On Friday, 9 June 2017 at 17:28:18 UTC, Steven Schveighoffer 
wrote:
 This is why I think such a function should exist in 
 Phobos/druntime. I'm not 100% sure it doesn't already, but I 
 couldn't find one...

Looking at the implementation of Tuple.opCmp, I'm not sure I'd 
bet on existence of opCmp for fundamental types:

         int opCmp(R)(R rhs)
         if (areCompatibleTuples!(typeof(this), R, "<"))
         {
             foreach (i, Unused; Types)
             {
                 if (field[i] != rhs.field[i])
                 {
                     return field[i] < rhs.field[i] ? -1 : 1;
                 }
             }
             return 0;
         }

Jun 09 2017

Honey <honey pot.com> writes:

On Friday, 9 June 2017 at 17:50:28 UTC, Honey wrote:
 Looking at the implementation of Tuple.opCmp, I'm not sure I'd 
 bet on existence of opCmp for fundamental types:

         int opCmp(R)(R rhs)
         if (areCompatibleTuples!(typeof(this), R, "<"))
         {
             foreach (i, Unused; Types)
             {
                 if (field[i] != rhs.field[i])
                 {
                     return field[i] < rhs.field[i] ? -1 : 1;
                 }
             }
             return 0;
         }

It turned out that there is a standard three way comparison 
function for ranges and strings [1].

Doesn't it make sense to introduce another overload of cmp 
similar to Steve's doCmp [2] right at that spot?

This would simplify the implementation of opCmp for aggregates 
and can also lead to a moderate performance benefit [3]. (Note 
that [3] does not provide an additional overload of cmp but uses 
a less elegant approach with the same performance 
characteristics.)

// $ ldc2 -O3 -release -boundscheck=off cmpTuple.d
//
// real	0m18.787s
// user	0m18.784s
// sys	0m0.003s
//
//
// $ ldc2 -O3 -release -boundscheck=off cmpTuple.d 
-d-version=singlePassCmp
// $ time ./cmpTuple
//
// real	0m16.109s
// user	0m16.102s
// sys	0m0.007s

[1] https://dlang.org/phobos/std_algorithm_comparison.html#cmp
[2] http://forum.dlang.org/post/ohedtb$1phe$1 digitalmars.com
[3] https://dpaste.dzfl.pl/7cbfa2e1965b

Jun 11 2017

Honey <honey pot.com> writes:

On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
 Doesn't it make sense to introduce another overload of cmp 
 similar to Steve's doCmp [2] right at that spot?

Moreover, it seems that std.algorithm.cmp should employ three way 
comparison as well. The current implementation

int cmp(alias pred = "a < b", R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2 && !(isSomeString!R1 && 
isSomeString!R2))
{
     for (;; r1.popFront(), r2.popFront())
     {
         if (r1.empty) return -cast(int)!r2.empty;
         if (r2.empty) return !r1.empty;
         auto a = r1.front, b = r2.front;
         if (binaryFun!pred(a, b)) return -1;
         if (binaryFun!pred(b, a)) return 1;
     }
}

does not seem to be great.

Jun 11 2017

Honey <honey pot.com> writes:

On Sunday, 11 June 2017 at 15:40:42 UTC, Honey wrote:
 On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
 Doesn't it make sense to introduce another overload of cmp 
 similar to Steve's doCmp [2] right at that spot?

 Moreover, it seems that std.algorithm.cmp should employ three 
 way comparison as well.

Forget about it. I think a different approach is required, here.

Jun 11 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/11/17 11:24 AM, Honey wrote:
 On Friday, 9 June 2017 at 17:50:28 UTC, Honey wrote:
 Looking at the implementation of Tuple.opCmp, I'm not sure I'd bet on
 existence of opCmp for fundamental types:

         int opCmp(R)(R rhs)
         if (areCompatibleTuples!(typeof(this), R, "<"))
         {
             foreach (i, Unused; Types)
             {
                 if (field[i] != rhs.field[i])
                 {
                     return field[i] < rhs.field[i] ? -1 : 1;
                 }
             }
             return 0;
         }

 It turned out that there is a standard three way comparison function for
 ranges and strings [1].

Yes, I saw that when I was looking (you can see from my reply that you 
quoted below).

 Doesn't it make sense to introduce another overload of cmp similar to
 Steve's doCmp [2] right at that spot?

Yes I think it makes sense to have such a comparison function for 
non-ranges. Yes it probably belongs there, as there are other functions 
(e.g. swap) that are not specific to ranges in std.algorithm. It should 
probably not be called cmp, as it will be a default comparison (with the 
default ordering), although if we did something like:

int cmp(alias pred = "a < b", T)(T t1, T t2) if(!isInputRange!T)
{
    static if(pred == "a < b") { /* do default optimized way */ }
    else { /* more generic mechanism using pred */ }
}

it might be nice. Need to think about the API ramifications.

 This would simplify the implementation of opCmp for aggregates and can
 also lead to a moderate performance benefit [3]. (Note that [3] does not
 provide an additional overload of cmp but uses a less elegant approach
 with the same performance characteristics.)

I agree. It's a thing also that can be optimized in unintuitive ways. I 
think Andrei has a nice way to do opCmp for integers that's a simple 
subtraction and negation or something like that.

-Steve

Jun 13 2017

"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:

On Tue, Jun 13, 2017 at 10:51:40AM -0400, Steven Schveighoffer via
Digitalmars-d-learn wrote:
[...]
 I think Andrei has a nice way to do opCmp for integers that's a simple
 subtraction and negation or something like that.

[...]

In theory, cmp(int x, int y) can be implemented simply as (x - y).
However, this fails when integer overflow occurs.  Does Andrei have a
nice way of doing this that isn't vulnerable to integer overflow?


T

-- 
You are only young once, but you can stay immature indefinitely. -- azephrahel

Jun 13 2017

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Tuesday, 13 June 2017 at 16:49:14 UTC, H. S. Teoh wrote:
 On Tue, Jun 13, 2017 at 10:51:40AM -0400, Steven Schveighoffer 
 via Digitalmars-d-learn wrote: [...]
 I think Andrei has a nice way to do opCmp for integers that's 
 a simple subtraction and negation or something like that.

 [...]

 In theory, cmp(int x, int y) can be implemented simply as (x - 
 y).
 However, this fails when integer overflow occurs.  Does Andrei 
 have a
 nice way of doing this that isn't vulnerable to integer 
 overflow?

return (x > y) - (x < y);

According to this stackoverflow question it looks like a good 
candidate
https://stackoverflow.com/questions/10996418/efficient-integer-compare-function

Jun 13 2017

Honey <honey pot.com> writes:

On Tuesday, 13 June 2017 at 14:51:40 UTC, Steven Schveighoffer 
wrote:
 Yes, I saw that when I was looking (you can see from my reply 
 that you quoted below).

Yes, I had missed that point.


 Yes I think it makes sense to have such a comparison function 
 for non-ranges. Yes it probably belongs there, as there are 
 other functions (e.g. swap) that are not specific to ranges in 
 std.algorithm. It should probably not be called cmp, as it will 
 be a default comparison (with the default ordering), although 
 if we did something like:

 int cmp(alias pred = "a < b", T)(T t1, T t2) if(!isInputRange!T)
 {
    static if(pred == "a < b") { /* do default optimized way */ }
    else { /* more generic mechanism using pred */ }
 }

 it might be nice. Need to think about the API ramifications.

I retracted my earlier proposal after I had realized my 
confusion. I had thought that cmp would implement three way range 
comparison based on three way element comparison. Then I realized 
that it is based on "a < b" or alike. The latter is certainly 
useful but I am afraid that this approach does not always lead to 
optimal performance.

I gathered a few ideas about the subject. I have to sit down and 
write it down.


 I agree. It's a thing also that can be optimized in unintuitive 
 ways. I think Andrei has a nice way to do opCmp for integers 
 that's a simple subtraction and negation or something like that.

I observed that compiler optimizers are pretty smart about 
comparison of individual integers. I guess that we do not need to 
be clever, here.

Jun 13 2017

drug <drug2004 bk.ru> writes:

09.06.2017 18:12, Steven Schveighoffer пишет:
 int doCmp(T)(auto ref T t1, auto ref T t2)
 {
     static if(is(typeof(t1.opCmp(t2))))
      return t1.opCmp(t2);
     else
     {
        if(t1 < t2) return -1;
        else if(t1 > t2) return 1;
        return 0;
     }
 }

Isn't it enough to use just '<' et al?
int opCmp(const Pair r) const
{
     if (f < r.f)
	return -1;
     if (f > r.f)
	return 1;

     if (s < r.s)
	return -1;
     if (s > r.s)
	return 1;

     return 0;
}
for floating types it's not completely right of course but nevertheless

Jun 09 2017

drug <drug2004 bk.ru> writes:

I re-read thoroughly and got it)

Jun 09 2017

D Programming

C/C++ Programming

Other

digitalmars.D.learn - How to implement opCmp?