digitalmars.D - WAT: opCmp and opEquals woes

reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
This morning, I discovered this major WAT in D:

----
struct S {
        int x;
        int y;
        int opCmp(S s) {
                return x - s.x; // compare only x
        }
}

void main() {
        auto s1 = S(1,2);
        auto s2 = S(1,3);
        auto s3 = S(2,1);

        assert(s1 < s3); // OK
        assert(s2 < s3); // OK
        assert(s3 > s1); // OK
        assert(s3 > s2); // OK
        assert(s1 <= s2 && s2 >= s1); // OK
        assert(s1 == s2); // FAIL -- WAT??
}
----

The reason for this is that the <, <=, >=, > operators are defined in
terms of opCmp (which, btw, is defined to return 0 when the objects
being compared are equal), but == is defined in terms of opEquals. When
opEquals is not defined, it defaults to the built-in compiler
definition, which is a memberwise equality test, even if opCmp *is*
defined, and returns 0 when the objects are equal.
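
For reference, a minimal sketch of the workaround under the current rules: define opEquals by hand so that == agrees with opCmp. This is only an illustration built on the struct above; the const qualifiers and the unittest block are additions that are not part of the original example (build with -unittest -main to run it).

```d
struct S {
    int x;
    int y;

    // Ordering looks only at x, as in the example above.
    int opCmp(const S s) const {
        return x - s.x;
    }

    // Hand-written so that == agrees with opCmp. Without this,
    // == falls back to comparing both x and y member by member.
    bool opEquals(const S s) const {
        return opCmp(s) == 0;
    }
}

unittest {
    auto s1 = S(1, 2);
    auto s2 = S(1, 3);
    auto s3 = S(2, 1);

    assert(s1 <= s2 && s2 >= s1);
    assert(s1 == s2); // passes once opEquals is defined via opCmp
    assert(s1 != s3);
}
```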

Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
says this is the case (unfortunately I'm at work so I can't check my
copy of TDPL).

https://issues.dlang.org/show_bug.cgi?id=13179

:-(


T

-- 
English has the lovely word "defenestrate", meaning "to execute by throwing
someone out a window", or more recently "to remove Windows from a computer and
replace it with something useful". :-) -- John Cowan
Jul 23 2014
next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 7/23/14, 1:45 PM, H. S. Teoh via Digitalmars-d wrote:
 This morning, I discovered this major WAT in D:

 ----
 struct S {
          int x;
          int y;
          int opCmp(S s) {
                  return x - s.x; // compare only x
          }
 }

 void main() {
          auto s1 = S(1,2);
          auto s2 = S(1,3);
          auto s3 = S(2,1);

          assert(s1 < s3); // OK
          assert(s2 < s3); // OK
          assert(s3 > s1); // OK
          assert(s3 > s2); // OK
          assert(s1 <= s2 && s2 >= s1); // OK
          assert(s1 == s2); // FAIL -- WAT??
 }
 ----

 The reason for this is that the <, <=, >=, > operators are defined in
 terms of opCmp (which, btw, is defined to return 0 when the objects
 being compared are equal), but == is defined in terms of opEquals. When
 opEquals is not defined, it defaults to the built-in compiler
 definition, which is a memberwise equality test, even if opCmp *is*
 defined, and returns 0 when the objects are equal.

 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
 says this is the case (unfortunately I'm at work so I can't check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(


 T
Imagine you have a list of integers and strings denoting integers: [1, "2", 100, "38"]. Now you want to sort them according to their numeric value. Of course, 1 and "1" would have the same order. However, 1 and "1" are different, so "==" would give false, while 1.opCmp("1") would give 0. Equality and comparison are different. opCmp is used for sorting objects, which has nothing to do with equality. Inferring equality from opCmp is wrong in my opinion.
Jul 23 2014
parent reply "Dicebot" <public dicebot.lv> writes:
On Wednesday, 23 July 2014 at 17:15:12 UTC, Ary Borenszweig wrote:
 Imagine you have a list of integers and strings denoting 
 integers: [1, "2", 100, "38"]. Now you want to sort them 
 according to their numeric value. Of course, 1 and "1" would 
 have the same order. However, 1 and "1" are different, so "==" 
 would give false, while 1.opCmp("1") would give 0.

 Equality and comparison are different. opCmp is used for 
 sorting objects, which has nothing to do with equality. 
 Inferring equality from opCmp is wrong in my opinion.
Well, this is why you can actually override those :) I think automatic opCmp -> opEquals generation covers the vast majority of use cases and as such will have a very good effort / decreased annoyance ratio.
Jul 23 2014
next sibling parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 7/23/14, 11:09 AM, Dicebot wrote:
 On Wednesday, 23 July 2014 at 17:15:12 UTC, Ary Borenszweig wrote:
 Imagine you have a list of integers and strings denoting integers: [1,
 "2", 100, "38"]. Now you want to sort them according to their numeric
 value. Of course, 1 and "1" would have the same order. However, 1 and
 "1" are different, so "==" would give false, while 1.opCmp("1") would
 give 0.

 Equality and comparison are different. opCmp is used for sorting
 objects, which has nothing to do with equality. Inferring equality
 from opCmp is wrong in my opinion.
Well, this is why you can actually override those :) I think automatic opCmp -> opEquals generation covers the vast majority of use cases and as such will have a very good effort / decreased annoyance ratio.
I agree. In fact I think if you've implemented opCmp to sort 1 and "1" as equal that in most cases you'd expect "1" and 1 to compare as logically equal. Automatic opCmp -> opEquals seems like a very sane default to me.
Jul 23 2014
parent Ali Çehreli <acehreli yahoo.com> writes:
On 07/23/2014 11:26 AM, David Gileadi wrote:
 On 7/23/14, 11:09 AM, Dicebot wrote:
 On Wednesday, 23 July 2014 at 17:15:12 UTC, Ary Borenszweig wrote:
 Imagine you have a list of integers and strings denoting integers: [1,
 "2", 100, "38"]. Now you want to sort them according to their numeric
 value. Of course, 1 and "1" would have the same order. However, 1 and
 "1" are different, so "==" would give false, while 1.opCmp("1") would
 give 0.

 Equality and comparison are different. opCmp is used for sorting
 objects, which has nothing to do with equality. Inferring equality
 from opCmp is wrong in my opinion.
Well, this is why you can actually override those :) I think automatic opCmp -> opEquals generation covers the vast majority of use cases and as such will have a very good effort / decreased annoyance ratio.
I agree. In fact I think if you've implemented opCmp to sort 1 and "1" as equal that in most cases you'd expect "1" and 1 to compare as logically equal. Automatic opCmp -> opEquals seems like a very sane default to me.
To add, C++ is getting "= default" versions: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3950.html

Ali
Jul 23 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 11:09 AM, Dicebot wrote:
 On Wednesday, 23 July 2014 at 17:15:12 UTC, Ary Borenszweig wrote:
 Imagine you have a list of integers and strings denoting integers: [1,
 "2", 100, "38"]. Now you want to sort them according to their numeric
 value. Of course, 1 and "1" would have the same order. However, 1 and
 "1" are different, so "==" would give false, while 1.opCmp("1") would
 give 0.

 Equality and comparison are different. opCmp is used for sorting
 objects, which has nothing to do with equality. Inferring equality
 from opCmp is wrong in my opinion.
Well, this is why you can actually override those :) I think automatic opCmp -> opEquals generation covers the vast majority of use cases and as such will have a very good effort / decreased annoyance ratio.
I'd say let's leave things as they are. opEquals may need to do less work than opCmp, and it often sees intensive use. -- Andrei
Jul 23 2014
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 23 July 2014 at 18:49:49 UTC, Andrei Alexandrescu 
wrote:
 I'd say let's leave things as they are. opEquals may need to do 
 less work than opCmp, and it often sees intensive use. -- Andrei
You will be the one answering user complaints ;)
Jul 23 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jul 23, 2014 at 11:49:58AM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 7/23/14, 11:09 AM, Dicebot wrote:
On Wednesday, 23 July 2014 at 17:15:12 UTC, Ary Borenszweig wrote:
Imagine you have a list of integers and strings denoting integers:
[1, "2", 100, "38"]. Now you want to sort them according to their
numeric value. Of course, 1 and "1" would have the same order.
However, 1 and "1" are different, so "==" would give false, while
1.opCmp("1") would give 0.

Equality and comparison are different. opCmp is used for sorting
objects, which has nothing to do with equality. Inferring equality
from opCmp is wrong in my opinion.
Well, this is why you can actually override those :) I think automatic opCmp -> opEquals generation covers the vast majority of use cases and as such will have a very good effort / decreased annoyance ratio.
I'd say let's leave things as they are. opEquals may need to do less work than opCmp, and it often sees intensive use. -- Andrei
If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much rather say it should be a compile error if the user defines opCmp but not opEquals. Currently, we have the bad situation where == behaves inconsistently w.r.t. <, <=, >=, > because we allow opCmp to be defined but opEquals not, and the default compiler implementation of opEquals may or may not match the meaning of opCmp.

In short, what I'd like to see, in order of preference, is:

(1) If opCmp is defined but opEquals not, then opEquals should be defined to be opCmp()==0.

(2) If (1) is a no-go, then the next best situation is that if opCmp is defined but opEquals isn't, then the compiler should issue an error, rather than implicitly generating a default opEquals that probably does not match the programmer's expectations.

(3) If (2) is also a no-go, the 3rd best situation is that if opCmp is defined but opEquals isn't, then the compiler should issue an error if the user ever writes "a==b". That is, we allow the user to not define opEquals as long as it's not actually used, but it's an error if it is used.

(4) Do nothing, and allow the current hidden breakage to perpetuate and make people hate D when they suddenly discover it when they forget to implement opEquals.

I really hope we don't have to resort to (4).


T

-- 
All men are mortal. Socrates is mortal. Therefore all men are Socrates.
Jul 23 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
 rather say it should be a compile error if the user defines opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. -- Andrei
Jul 23 2014
next sibling parent Brad Roberts via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 7/23/2014 2:36 PM, Andrei Alexandrescu via Digitalmars-d wrote:
 On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
 rather say it should be a compile error if the user defines opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. -- Andrei
Right, but in that case just define both. It's not the dominant case, so it shouldn't define the default behavior. Or if you truly have a case of comparable but no possibility of equal, just disable opEquals (or define and throw, or assert, or...).
Jul 23 2014
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu 
wrote:
 On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then 
 I'd much
 rather say it should be a compile error if the user defines 
 opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --
I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but it does not work with generic code for a type to define opCmp or opEquals such that they're not consistent, because that's not consistent with how comparisons work for the built-in types. - Jonathan M Davis
Jul 23 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 5:28 PM, Jonathan M Davis wrote:
 On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu wrote:
 On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then I'd much
 rather say it should be a compile error if the user defines opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --
I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but it does not work with generic code for a type to define opCmp or opEquals such that they're not consistent, because that's not consistent with how comparisons work for the built-in types.
std.algorithm.sort does not use equality at all. It just deems objects for which pred(a, b) and pred(b, a) are both false as unordered. -- Andrei
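A small sketch of that point, assuming a made-up struct P and a made-up predicate byX (neither comes from Phobos or from the posts above): sort only ever calls the predicate, and two values count as unordered when the predicate is false in both directions, regardless of what == says about them.

```d
import std.algorithm : sort;

struct P {
    int x;
    int y;
}

// Hypothetical ordering predicate: looks only at x.
bool byX(P a, P b) {
    return a.x < b.x;
}

// Two values are unordered with respect to `less`
// when neither one comes before the other.
bool unordered(alias less, T)(T a, T b) {
    return !less(a, b) && !less(b, a);
}

unittest {
    auto ps = [P(2, 1), P(1, 3), P(1, 2)];
    sort!byX(ps); // no opCmp and no opEquals involved

    // Only the x values are checked, since the relative order of
    // the two elements with x == 1 is not specified.
    assert(ps[0].x == 1 && ps[1].x == 1 && ps[2].x == 2);

    assert(unordered!byX(P(1, 2), P(1, 3)));
    assert(P(1, 2) != P(1, 3)); // built-in memberwise == still tells them apart
}
```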
Jul 23 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 00:45:05 UTC, Andrei Alexandrescu 
wrote:
 std.algorithm.sort does not use equality at all. It just deems 
 objects for which pred(a, b) and pred(b, a) are both false as unordered. --
I know. It uses < by default. What I'm disputing is that it's okay to define a type whose opCmp does not match its opEquals - or even that it has opCmp but doesn't have opEquals. I would strongly argue that such a type should not use opCmp for its ordering, because otherwise, its comparison operators are not consistent with how the comparison operators work for the built-in types.

I brought up sort, because that's usually where the question of ordering comes up, and our sort function is designed so that it does not require opCmp (the same with other Phobos constructs such as RedBlackTree). So, if a type needs to do its ordering in a manner which is inconsistent with opEquals, it can do so by providing a function other than opCmp for those comparisons. But I think that it's a poor idea to have opCmp not be consistent with opEquals, since most generic code will assume that they're consistent.

I'd strongly argue that an overloaded operator should be consistent with how the built-in operators work, or that functionality should not be using an overloaded operator. This is especially important when generic code comes into play, but it's also very important with regards to understanding how code works in general. Operators bring with them certain expectations of behavior, and they should maintain that and not be used in a manner which violates that. And having opCmp be inconsistent with opEquals violates that, indicating that opCmp should not be used in that case.

As such, I think that arguing that opCmp should be able to exist without an opEquals is a bad argument. I think that if you have opCmp, you should be required to have opEquals (though not automatically defined as lhs.opCmp(rhs) == 0, because that would incur a silent performance hit).

- Jonathan M Davis
Jul 23 2014
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 24 July 2014 at 00:28:06 UTC, Jonathan M Davis wrote:
 On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu 
 wrote:
 On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then 
 I'd much
 rather say it should be a compile error if the user defines 
 opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --
I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but it does not work with generic code for a type to define opCmp or opEquals such that they're not consistent, because that's not consistent with how comparisons work for the built-in types. - Jonathan M Davis
floating point ?
Jul 23 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 01:37:01 UTC, deadalnix wrote:
 On Thursday, 24 July 2014 at 00:28:06 UTC, Jonathan M Davis 
 wrote:
 On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei 
 Alexandrescu wrote:
 On 7/23/14, 12:04 PM, H. S. Teoh via Digitalmars-d wrote:
 If autogenerating opEquals to be opCmp()==0 is a no-go, then 
 I'd much
 rather say it should be a compile error if the user defines 
 opCmp but
 not opEquals.
No. There is this notion of partial ordering that makes objects not smaller and not greater than others, yet not equal. --
I would strongly argue that if lhs.opCmp(rhs) == 0 is not equivalent to lhs == rhs, then that type is broken and should not be using opCmp to do its comparisons. std.algorithm.sort allows you to use any predicate you want, allowing for such orderings, but it does not work with generic code for a type to define opCmp or opEquals such that they're not consistent, because that's not consistent with how comparisons work for the built-in types. - Jonathan M Davis
floating point ?
When it comes to equality and comparison, floating point values are a mess that I would really hope no one would be looking to emulate with their own types. I grant you that they're a built-in type, but they do not have clean semantics (particularly with regards to equality). IMHO, user-defined types should emulate integers with regards to how the comparison operators work. Allowing more nonsense like what FP does does not improve things. - Jonathan M Davis
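For concreteness, the floating point behaviour in question can be sketched with the built-in double type alone. The helper name incomparable is made up for this illustration.

```d
// True when a and b are neither ordered nor equal.
bool incomparable(double a, double b) {
    return !(a < b) && !(a > b) && a != b;
}

unittest {
    // NaN is not less than, greater than, or equal to anything...
    assert(incomparable(double.nan, 1.0));
    // ...including itself.
    assert(incomparable(double.nan, double.nan));

    // Ordinary values are always either ordered or equal.
    assert(!incomparable(1.0, 1.0));
    assert(!incomparable(1.0, 2.0));
}
```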
Jul 24 2014
prev sibling parent "Fool" <fool dlang.org> writes:
On Wednesday, 23 July 2014 at 21:36:16 UTC, Andrei Alexandrescu 
wrote:
 There is this notion of partial ordering that makes objects not 
 smaller and not greater than others, yet not equal. -- Andrei
How do you define a partial ordering using opCmp?
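One possible answer, offered as a sketch rather than an established recipe: since a < b is rewritten to a.opCmp(b) < 0 (and likewise for <=, >, >=), an opCmp that returns a floating point value can return NaN for incomparable operands, which makes all four ordering operators false. This assumes the compiler accepts a float return type for opCmp; the BitSet type below is made up for the illustration and orders small sets by the subset relation.

```d
struct BitSet {
    uint bits;

    // Subset ordering: NaN signals "incomparable".
    float opCmp(const BitSet rhs) const {
        if (bits == rhs.bits) return 0;
        if ((bits & rhs.bits) == bits) return -1;    // this is a proper subset of rhs
        if ((bits & rhs.bits) == rhs.bits) return 1; // this is a proper superset of rhs
        return float.nan;                            // neither contains the other
    }

    bool opEquals(const BitSet rhs) const {
        return bits == rhs.bits;
    }
}

unittest {
    auto a = BitSet(0b011);
    auto b = BitSet(0b110);
    auto c = BitSet(0b111);

    assert(a < c && b < c && c > a);

    // a and b are not smaller, not greater, and not equal.
    assert(!(a < b) && !(a > b) && !(a <= b) && !(a >= b));
    assert(a != b);
}
```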
Aug 16 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
 says this is the case (unfortunately I'm at work so I can't check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(
It's a good decision. There are types that are comparable for equality but not for ordering. -- Andrei
Jul 23 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jul 23, 2014 at 11:48:42AM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 7/23/14, 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
says this is the case (unfortunately I'm at work so I can't check my
copy of TDPL).

https://issues.dlang.org/show_bug.cgi?id=13179

:-(
It's a good decision. There are types that are comparable for equality but not for ordering. -- Andrei
That's the wrong way round. I fully agree that we should not autogenerate opCmp if the user defines opEquals, since not all types comparable with equality are orderable. However, surely all orderable types are equality-comparable! Therefore, if opCmp is defined but opEquals isn't, then we should autogenerate opEquals to be the same as a.opCmp(b)==0.


T

-- 
No! I'm not in denial!
Jul 23 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 11:52 AM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, Jul 23, 2014 at 11:48:42AM -0700, Andrei Alexandrescu via
Digitalmars-d wrote:
 On 7/23/14, 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
 says this is the case (unfortunately I'm at work so I can't check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(
It's a good decision. There are types that are comparable for equality but not for ordering. -- Andrei
That's the wrong way round.
No.
 I fully agree that we should not
 autogenerate opCmp if the user defines opEquals, since not all types
 comparable with equality are orderable.  However, surely all orderable
 types are equality-comparable!
http://en.wikipedia.org/wiki/Lattice_(order)
 Therefore, if opCmp is defined but
 opEquals isn't, then we should autogenerate opEquals to be the same as
 a.opCmp(b)==0.
It's a sensible decision, but I'm not so sure. Andrei
Jul 23 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jul 23, 2014 at 02:35:03PM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 7/23/14, 11:52 AM, H. S. Teoh via Digitalmars-d wrote:
[...]
I fully agree that we should not autogenerate opCmp if the user
defines opEquals, since not all types comparable with equality are
orderable.  However, surely all orderable types are
equality-comparable!
http://en.wikipedia.org/wiki/Lattice_(order)
[...]

And why should this be the default behaviour? The <, <=, >=, > operators imply linear ordering, not general partial order. If you really want to implement a non-linear partial ordering, you can always define both opCmp and opEquals. This should be the *non*-default case, since in the vast majority of cases, defining opCmp means you want a linear ordering. Linear ordering should be default, and partial ordering possible if the programmer explicitly asks for it (by implementing opEquals manually).


T

-- 
Winners never quit, quitters never win. But those who never quit AND never win are idiots.
Jul 23 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 3:39 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, Jul 23, 2014 at 02:35:03PM -0700, Andrei Alexandrescu via
Digitalmars-d wrote:
 On 7/23/14, 11:52 AM, H. S. Teoh via Digitalmars-d wrote:
[...]
 I fully agree that we should not autogenerate opCmp if the user
 defines opEquals, since not all types comparable with equality are
 orderable.  However, surely all orderable types are
 equality-comparable!
http://en.wikipedia.org/wiki/Lattice_(order)
[...] And why should this be the default behaviour? The <, <=, >=, > operators imply linear ordering, not general partial order. If you really want to implement a non-linear partial ordering, you can always define both opCmp and opEquals. This should be the *non*-default case, since in the vast majority of cases, defining opCmp means you want a linear ordering. Linear ordering should be default, and partial ordering possible if the programmer explicitly asks for it (by implementing opEquals manually).
I'm unconvinced. Most algorithms that need inequality don't need equality comparison; instead, they consider objects for which both !(a < b) and !(b < a) hold to be in the same "equivalence class", which doesn't assume they are actually equal.

Bottom line, inferring opEquals from opCmp seems fishy.

Andrei
Jul 23 2014
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 23 July 2014 at 23:52:01 UTC, Andrei Alexandrescu
wrote:
 On 7/23/14, 3:39 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, Jul 23, 2014 at 02:35:03PM -0700, Andrei Alexandrescu 
 via Digitalmars-d wrote:
 On 7/23/14, 11:52 AM, H. S. Teoh via Digitalmars-d wrote:
[...]
 I fully agree that we should not autogenerate opCmp if the 
 user
 defines opEquals, since not all types comparable with 
 equality are
 orderable.  However, surely all orderable types are
 equality-comparable!
http://en.wikipedia.org/wiki/Lattice_(order)
[...] And why should this be the default behaviour? The <, <=, >=, > operators imply linear ordering, not general partial order. If you really want to implement a non-linear partial ordering, you can always define both opCmp and opEquals. This should be the *non*-default case, since in the vast majority of cases, defining opCmp means you want a linear ordering. Linear ordering should be default, and partial ordering possible if the programmer explicitly asks for it (by implementing opEquals manually).
I'm unconvinced. Most algorithms that need inequality don't need equality comparison; instead, they consider objects for which both !(a < b) && !(b < a) in the same "equivalence class" that doesn't assume they are actually equal. Bottom line, inferring opEquals from opCmp seems fishy. Andrei
NaN is a good example.
Jul 23 2014
prev sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 24.07.2014 01:52, schrieb Andrei Alexandrescu:
 I'm unconvinced. Most algorithms that need inequality don't need
 equality comparison; instead, they consider objects for which both !(a <
 b) && !(b < a) in the same "equivalence class" that doesn't assume they
 are actually equal.

 Bottom line, inferring opEquals from opCmp seems fishy.
You're thinking too algorithm-centric :-P

When I implement the "comparison operator" for my type, I expect it to be used for comparisons - and that includes equality. If I had the feeling that I could implement == in a more efficient way, or that I actually want equality to have different semantics, I'd just implement opEquals as well.

IMHO, everything else would be just confusing to the "average" user, and if someone wants to be confused by counterintuitive rules (however much sense they may make in some way) he could as well just use C++ instead.

But if the general view really is that opEquals should *not* be opCmp == 0 by default, for performance reasons or whatever, then please enforce defining opEquals when opCmp is defined, so it's at least explicit that opCmp does not define equality.

Cheers,
Daniel
Jul 24 2014
parent "Marc Schütz" <schuetzm gmx.net> writes:
On Thursday, 24 July 2014 at 08:18:22 UTC, Daniel Gibson wrote:
 When I implement the "comparison operator" for my type, I 
 expect it to be used for comparisons - and that includes 
 equality.
 If I had the feeling that I could implement == in a more 
 efficient way, or that I actually want equality to have 
 different semantics, I'd just implement opEquals as well.

 IMHO, everything else would be just confusing to the "average" 
 user, and if someone wants to be confused by counterintuitive 
 rules (however much sense they may make in some way) he could 
 as well just use C++ instead.

 But if the general view really is that opEquals should *not* be 
 opCmp == 0 by default, for performance reasons or whatever, 
 then please enforce defining opEquals when opCmp is defined, so 
 it's at least explicit that opCmp does not define equality.
+1 Silently breaking (IMHO reasonable) expectations is bad.
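Until the compiler enforces anything, the check asked for above could be approximated in user code for structs. The following is only a sketch: cmpWithoutEquals is a made-up name, and it relies on the assumption that __traits(hasMember, T, "opEquals") is false for a struct that does not declare opEquals itself (it would not help for classes, which always inherit opEquals from Object).

```d
// True if T declares opCmp but no opEquals of its own.
enum bool cmpWithoutEquals(T) =
    __traits(hasMember, T, "opCmp") && !__traits(hasMember, T, "opEquals");

struct OnlyCmp {
    int x;
    int opCmp(const OnlyCmp rhs) const { return (x > rhs.x) - (x < rhs.x); }
}

struct Both {
    int x;
    int opCmp(const Both rhs) const { return (x > rhs.x) - (x < rhs.x); }
    bool opEquals(const Both rhs) const { return x == rhs.x; }
}

// OnlyCmp is exactly the pattern the thread is worried about.
static assert(cmpWithoutEquals!OnlyCmp);
static assert(!cmpWithoutEquals!Both);
```

A project that wants the stricter rule could then put static assert(!cmpWithoutEquals!MyType) next to each type definition.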
Jul 25 2014
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 23 July 2014 at 18:53:57 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 However, surely all orderable types are equality-comparable!
Just because two objects don't have a defined ordering between them doesn't mean they are equal in a more general sense. Yes, it's a gotcha, but I think it's a worthwhile one.
Jul 23 2014
prev sibling next sibling parent reply "Ola Fosheim Grøstad" writes:
On Wednesday, 23 July 2014 at 18:53:57 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 That's the wrong way round. I fully agree that we should not
 autogenerate opCmp if the user defines opEquals, since not all 
 types
 comparable with equality are orderable.  However, surely all 
 orderable
 types are equality-comparable! Therefore, if opCmp is defined 
 but
 opEquals isn't, then we should autogenerate opEquals to be the 
 same as
 a.opCmp(b)==0.
You can define an order for sets/intervals without equality... For fuzzy numbers it gets even worse. You can define it such that a<b and b>a both are true...
Jul 23 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jul 23, 2014 at 10:42:19PM +0000, via Digitalmars-d wrote:
 On Wednesday, 23 July 2014 at 18:53:57 UTC, H. S. Teoh via Digitalmars-d
 wrote:
That's the wrong way round. I fully agree that we should not
autogenerate opCmp if the user defines opEquals, since not all types
comparable with equality are orderable.  However, surely all
orderable types are equality-comparable! Therefore, if opCmp is
defined but opEquals isn't, then we should autogenerate opEquals to
be the same as a.opCmp(b)==0.
You can define an order for sets/intervals without equality... For fuzzy numbers it gets even worse. You can define it such that a<b and b>a both are true...
(a<b && b>a) is true for ints.


T

-- 
All men are mortal. Socrates is mortal. Therefore all men are Socrates.
Jul 23 2014
parent reply "Ola Fosheim Grøstad" writes:
On Wednesday, 23 July 2014 at 23:02:48 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 fuzzy numbers it gets even worse. You can define it such that 
 a<b and
 b>a both are true...
(a<b && b>a) is true for ints.
That was a typo, for fuzzy numbers you can define less than such that a<b and b>a both are true. Fuzzy(-1,0,1) vs Fuzzy(-2,0,2)
Jul 23 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Ola Fosheim Grøstad" " wrote in message 
news:qxtukjuohhzngcutmmpz forum.dlang.org...

 On Wednesday, 23 July 2014 at 23:02:48 UTC, H. S. Teoh via Digitalmars-d 
 wrote:
 fuzzy numbers it gets even worse. You can define it such that a<b and
 b>a both are true...
(a<b && b>a) is true for ints.
That was a typo, for fuzzy numbers you can define less than such that a<b and b>a both are true. Fuzzy(-1,0,1) vs Fuzzy(-2,0,2)
a<b and b>a can be true for ints.
Jul 23 2014
parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 24 July 2014 at 06:18:34 UTC, Daniel Murphy wrote:
 "Ola Fosheim Grøstad" " wrote in message 
 news:qxtukjuohhzngcutmmpz forum.dlang.org...

 On Wednesday, 23 July 2014 at 23:02:48 UTC, H. S. Teoh via 
 Digitalmars-d wrote:
 fuzzy numbers it gets even worse. You can define it such 
 that a<b and
 b>a both are true...
(a<b && b>a) is true for ints.
That was a typo, for fuzzy numbers you can define less than such that a<b and b>a both are true. Fuzzy(-1,0,1) vs Fuzzy(-2,0,2)
a<b and b>a can be true for ints.
So I keep making the same typo :P... For fuzzy numbers you can define less than such that a<b and b<a both are true... yes?
Jul 23 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Ola Fosheim Grøstad" " wrote in message 
news:duuhouucozvosboibhtc forum.dlang.org...

 For fuzzy numbers you can define less than such that a<b and b<a both are 
 true... yes?
You could, but if you do it with opCmp it looks like operator overloading abuse to me.
Jul 23 2014
parent "Ola Fosheim Grøstad" writes:
On Thursday, 24 July 2014 at 06:46:08 UTC, Daniel Murphy wrote:
 "Ola Fosheim Grøstad" " wrote in message 
 news:duuhouucozvosboibhtc forum.dlang.org...

 For fuzzy numbers you can define less than such that a<b and 
 b<a both are true... yes?
You could, but if you do it with opCmp it looks like operator overloading abuse to me.
Well, but FuzzyNumbers are fuzzy sets that are treated like scalars in a pragmatic, but imperfect way. It makes sense to state that a vivid design A is both uglier and prettier than a boring and dull design B.

I think opCmp is a mistake once you move beyond real scalars. Defining sort order is a separate "tool". Take for instance complex numbers that can be ordered by magnitude, but you need to account for phase (in some arbitrary way since it is circular) to get total order. That does not mean that one should use the sort-comparison for non-sort comparison of complex numbers.
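The complex number case can be sketched with std.complex and a sort predicate, keeping the sort order separate from the comparison operators. The predicate name byMagnitude is made up for this illustration.

```d
import std.algorithm : sort;
import std.complex : Complex, abs;

// Hypothetical sort-only ordering: compares magnitudes and ignores phase.
bool byMagnitude(Complex!double a, Complex!double b) {
    return abs(a) < abs(b);
}

unittest {
    auto zs = [Complex!double(3.0, 4.0),   // magnitude 5
               Complex!double(0.0, 1.0),   // magnitude 1
               Complex!double(-2.0, 0.0)]; // magnitude 2
    sort!byMagnitude(zs);

    assert(zs[0] == Complex!double(0.0, 1.0));
    assert(zs[1] == Complex!double(-2.0, 0.0));
    assert(zs[2] == Complex!double(3.0, 4.0));

    // 1 and i tie in the sort order, yet they are different numbers.
    auto one = Complex!double(1.0, 0.0);
    auto i = Complex!double(0.0, 1.0);
    assert(!byMagnitude(one, i) && !byMagnitude(i, one));
    assert(one != i);
}
```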
Jul 24 2014
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, 23 July 2014 at 18:53:57 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Wed, Jul 23, 2014 at 11:48:42AM -0700, Andrei Alexandrescu 
 via Digitalmars-d wrote:
 On 7/23/14, 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty 
sure TDPL
says this is the case (unfortunately I'm at work so I can't 
check my
copy of TDPL).

https://issues.dlang.org/show_bug.cgi?id=13179

:-(
It's a good decision. There are types that are comparable for equality but not compared for ordering. -- Andrei
That's the wrong way round. I fully agree that we should not autogenerate opCmp if the user defines opEquals, since not all types comparable with equality are orderable. However, surely all orderable types are equality-comparable! Therefore, if opCmp is defined but opEquals isn't, then we should autogenerate opEquals to be the same as a.opCmp(b)==0.
That would incur a silent performance hit. We should either force the programmer to define opEquals (even if they just make it return a.opCmp(b) == 0) or we should keep the normal, generated one. The best option though would be to provide some way for the programmer to tell the compiler that they want to use the default one so that they still have to declare opEquals when they define opCmp (to make sure that the programmer didn't forget it), but they're still able to use the built-in one rather than writing it themselves. IIRC, C++11 has a way of doing that. Maybe we should add something similar. - Jonathan M Davis
Jul 23 2014
next sibling parent "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Jonathan M Davis"  wrote in message 
news:kquxovegjzzsivftxsud forum.dlang.org...

 The best option though would be to provide some way for the programmer to 
 tell the compiler that they want to use the default one so that they still 
 have to declare opEquals when they define opCmp (to make sure that the 
 programmer didn't forget it), but they're still able to use the built-in 
 one rather than writing it themselves. IIRC, C++11 has a way of doing 
 that. Maybe we should add something similar.
bool opEquals(const ref typeof(this) other) const { return this.tupleof == other.tupleof; }
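For context, a minimal sketch of how that one-liner might sit next to an x-only opCmp. The struct S and its fields are borrowed from the example at the top of the thread. Taking the parameter by value and writing the comparison as (x > rhs.x) - (x < rhs.x) are assumptions made here for illustration, not something the post above specifies.

```d
struct S
{
    int x;
    int y;

    // Ordering looks at x only, as in the example that started the thread.
    // Written as a three-way comparison instead of x - rhs.x to avoid overflow.
    int opCmp(const S rhs) const
    {
        return (x > rhs.x) - (x < rhs.x);
    }

    // Explicit field-for-field equality: every member takes part.
    bool opEquals(const S rhs) const
    {
        return this.tupleof == rhs.tupleof;
    }
}

void main()
{
    auto s1 = S(1, 2);
    auto s2 = S(1, 3);

    assert(s1 <= s2 && s2 <= s1); // opCmp sees them as equivalent
    assert(s1 != s2);             // opEquals still tells them apart
    assert(s1 == S(1, 2));
}
```

Writing it out this way documents that == and opCmp == 0 disagree on purpose, which is exactly the case the rest of the thread argues about.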
Jul 23 2014
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 24/07/14 02:41, Jonathan M Davis wrote:

 That would incur a silent performance hit.
So do default-initialized values, virtual by default, classes allocated on the heap, and other features of D. By default D chooses safety and correctness. If the programmer needs more performance, some additional code might be required. -- /Jacob Carlborg
Jul 23 2014
prev sibling next sibling parent reply "Brian Schott" <briancschott gmail.com> writes:
As of about 2 minutes ago, D-Scanner can help with this.

---
struct A {
	int opCmp() const;
}
---

$ ./dscanner --styleCheck ~/tmp/test.d
test.d(1:8)[warn]: 'A' has method 'opCmp', but not 'opEquals'.
Jul 23 2014
parent reply "Martin Nowak" <code dawg.eu> writes:
On Wednesday, 23 July 2014 at 21:14:35 UTC, Brian Schott wrote:
 $ ./dscanner --styleCheck ~/tmp/test.d
 test.d(1:8)[warn]: 'A' has method 'opCmp', but not 'opEquals'.
Nice
Oct 15 2014
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 15 Oct 2014 11:25:10 +0000
schrieb "Martin Nowak" <code dawg.eu>:

 On Wednesday, 23 July 2014 at 21:14:35 UTC, Brian Schott wrote:
 $ ./dscanner --styleCheck ~/tmp/test.d
 test.d(1:8)[warn]: 'A' has method 'opCmp', but not 'opEquals'.
Nice
Yep! Some say opCmp is something entirely different from opEquals. One creates an order between two objects while the other tests for equality. I always had the idea that opCmp supersedes opEquals. What dscanner does is probably the only sane way out :) -- Marco
Oct 15 2014
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, 23 July 2014 at 16:47:40 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 This morning, I discovered this major WAT in D:

 ----
 struct S {
         int x;
         int y;
         int opCmp(S s) {
                 return x - s.x; // compare only x
         }
 }

 void main() {
         auto s1 = S(1,2);
         auto s2 = S(1,3);
         auto s3 = S(2,1);

         assert(s1 < s3); // OK
         assert(s2 < s3); // OK
         assert(s3 > s1); // OK
         assert(s3 > s2); // OK
         assert(s1 <= s2 && s2 >= s1); // OK
         assert(s1 == s2); // FAIL -- WAT??
 }
 ----

 The reason for this is that the <, <=, >=, > operators are 
 defined in
 terms of opCmp (which, btw, is defined to return 0 when the 
 objects
 being compared are equal), but == is defined in terms of 
 opEquals. When
 opEquals is not defined, it defaults to the built-in compiler
 definition, which is a membership equality test, even if opCmp 
 *is*
 defined, and returns 0 when the objects are equal.

 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure 
 TDPL
 says this is the case (unfortunately I'm at work so I can't 
 check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(
I would argue that the compiler should still be generating opEquals even if opCmp is defined. Otherwise, even if opCmp is consistent with the built-in opEquals, you'll be forced to reimplement opEquals - and toHash if you're using that type as a key, since once you define opEquals, you have to define toHash.

If it makes sense for a type to define opCmp but not define opEquals (which I seriously question), then I think that it should be explicit, in which case, we can use @disable, e.g. something like

    struct S
    {
        @disable bool opEquals(ref S s);
        int opCmp(ref S s) { ... }
        ...
    }

- Jonathan M Davis
Jul 23 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 00:31:55 UTC, Jonathan M Davis wrote:
 I would argue that the compiler should still be generating 
 opEquals even if opCmp is defined.
I take this back. As I later suggested in a post somewhere else in this thread (and the bug report that H.S. Teoh opened), I think that we should continue to not define opEquals, but we should add a way to tell the compiler to use the default-generated one (similar to what C++11 does). That way, the programmer is forced to consider what opEquals is supposed to do when opCmp is defined, but they're still able to use the default-generated one. The same goes for toHash. Regardless, because automatically making opEquals be lhs.opCmp(rhs) == 0 incurs a silent performance hit, I think that it's a bad idea. - Jonathan M Davis
Jul 23 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jul 24, 2014 at 12:51:31AM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, 24 July 2014 at 00:31:55 UTC, Jonathan M Davis wrote:
I would argue that the compiler should still be generating opEquals
even if opCmp is defined.
I take this back. As I later suggested in a post somewhere else in this thread (and the bug report that H.S. Teoh opened), I think that we should continue to not define opEquals, but we should add a way to tell the compiler to use the default-generated one (similar to what C++11 does). That way, the programmer is forced to consider what opEquals is supposed to do when opCmp is defined, but they're still able to use the default-generated one. The same goes for toHash. Regardless, because automatically making opEquals be lhs.opCmp(rhs) == 0 incurs a silent performance hit, I think that it's a bad idea.
[...] This sounds like a reasonable compromise. If the user defines opCmp, then it is an error not to define opEquals. However, opEquals can be specified to be default, to get the compiler-generated version:

    struct S {
        int opCmp(S s) { ... }
        bool opEquals(S s) = default;
    }

Optionally, we could also allow opEquals to be disabled, perhaps. Same goes with toHash.

Keep in mind, though, that due to current AA changes in 2.066 beta, existing code WILL break unless we autogenerate opEquals to be opCmp()==0. In fact, issue 13179 was originally filed because 2.066 beta broke Tango. My current stance is that these AA changes are an improvement that we should keep, so then the question becomes, should we break code over it, or should we introduce opEquals = (opCmp()==0), which would allow existing code *not* to break?

Given the choice between (1) breaking code *and* allowing opCmp to be inconsistent with opEquals, as the current situation is, and (2) *not* breaking code and making opEquals consistent with opCmp by default, I would choose (2) as being clearly more advantageous.

The above compromise solves the opEquals/opCmp consistency problem, but does not address the inevitable code breakage that will happen when 2.066 is released. Is it really worth the ire of D coders to have their existing code break, for the questionable benefit of being able to make opEquals inconsistent with opCmp just so we can support partial orderings on types? I don't know about you, but if it were up to me, I would much rather go with the solution of setting opEquals = (opCmp()==0) by default, and let the user redefine opEquals if they want partial orderings or eliminate performance hits, etc..

T -- Skill without imagination is craftsmanship and gives us many useful objects such as wickerwork picnic baskets. Imagination without skill gives us modern art. -- Tom Stoppard
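A sketch of what option (2) amounts to if the user writes it by hand. Note that `= default` is only a proposal in this thread, not existing D syntax, so this version forwards opEquals to opCmp explicitly. The struct is the one from the original post; the three-way comparison replaces x - s.x as an assumption made here to sidestep integer overflow.

```d
struct S
{
    int x;
    int y;

    // Compare only x, as in the original example.
    int opCmp(const S rhs) const
    {
        return (x > rhs.x) - (x < rhs.x);
    }

    // Hand-written equivalent of "opEquals = (opCmp() == 0)".
    bool opEquals(const S rhs) const
    {
        return opCmp(rhs) == 0;
    }
}

void main()
{
    auto s1 = S(1, 2);
    auto s2 = S(1, 3);
    auto s3 = S(2, 1);

    assert(s1 < s3);
    assert(s3 > s2);
    assert(s1 <= s2 && s2 >= s1);
    assert(s1 == s2); // no longer fails: == now agrees with opCmp
    assert(s1 != s3);
}
```

If such a type is also used as an AA key, a toHash that ignores y would be needed as well so that equal keys hash alike.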
Jul 23 2014
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Thursday, 24 July 2014 at 01:39:01 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 Keep in mind, though, that due to current AA changes in 2.066 
 beta,
 existing code WILL break unless we autogenerate opEquals to be
 opCmp()==0. In fact, issue 13179 was originally filed because 
 2.066 beta
 broke Tango.
This is exactly what I was referring to by "you will answer to user complaints". In the context of the declared backwards-compatibility efforts, the decision to reject a pragmatic solution in favor of a purist approach that breaks user code does sound strange.
Jul 24 2014
parent "w0rp" <devw0rp gmail.com> writes:
I wonder. If opCmp is supposed to imply partial ordering, then 
that means opCmp should imply the antisymmetric property of 
partial ordering. http://mathworld.wolfram.com/PartialOrder.html

a <= b and b <= a implies a = b.

That would mean for us that opEquals being generated with opCmp 
== 0 would make sense.

Without that antisymmetric property, it would imply only that it was 
a preorder, but not a partial order. 
http://mathworld.wolfram.com/Preorder.html
Jul 24 2014
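The antisymmetry condition can be checked mechanically. Below is a small sketch, assuming the struct from the original post; breaksAntisymmetry is a hypothetical helper name made up for this illustration. It reports pairs where a <= b and b <= a both hold while a == b does not.

```d
struct S
{
    int x;
    int y;

    // Compare only x; no opEquals is defined, so == is member-wise.
    int opCmp(S s) const
    {
        return (x > s.x) - (x < s.x);
    }
}

// True when a <= b and b <= a hold but a == b does not,
// i.e. when <= fails to be antisymmetric with respect to ==.
bool breaksAntisymmetry(T)(T a, T b)
{
    return a <= b && b <= a && !(a == b);
}

void main()
{
    assert(breaksAntisymmetry(S(1, 2), S(1, 3)));  // same x, different y
    assert(!breaksAntisymmetry(S(1, 2), S(1, 2))); // identical values
    assert(!breaksAntisymmetry(S(1, 2), S(2, 1))); // strictly ordered
}
```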
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 01:39:01 UTC, H. S. Teoh via 
Digitalmars-d wrote:

 Keep in mind, though, that due to current AA changes in 2.066 
 beta,
 existing code WILL break unless we autogenerate opEquals to be
 opCmp()==0. In fact, issue 13179 was originally filed because 
 2.066 beta
 broke Tango. My current stance is that these AA changes are an
 improvement that we should keep, so then the question becomes, 
 should we
 break code over it, or should we introduce opEquals = 
 (opCmp()==0),
 which would allow existing code *not* to break?
Can we just adjust the AA implementation so that it uses lhs.opCmp(rhs) == 0 if opEquals isn't defined and produces a deprecation warning about that? That way, we avoid immediately breaking folks, but we still move towards requiring that they define opEquals. - Jonathan M Davis
Jul 24 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/24/14, 11:30 AM, Jonathan M Davis wrote:
 On Thursday, 24 July 2014 at 01:39:01 UTC, H. S. Teoh via Digitalmars-d
 wrote:

 Keep in mind, though, that due to current AA changes in 2.066 beta,
 existing code WILL break unless we autogenerate opEquals to be
 opCmp()==0. In fact, issue 13179 was originally filed because 2.066 beta
 broke Tango. My current stance is that these AA changes are an
 improvement that we should keep, so then the question becomes, should we
 break code over it, or should we introduce opEquals = (opCmp()==0),
 which would allow existing code *not* to break?
Can we just adjust the AA implementation so that it uses lhs.opCmp(rhs) == 0 if opEquals isn't defined and produces a deprecation warning about that? That way, we avoid immediately breaking folks, but we still move towards requiring that they define opEquals.
I like Daniel's idea to auto-define opEquals as a field-for-field comparison. -- Andrei
Jul 24 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 19:35:12 UTC, Andrei Alexandrescu 
wrote:
 I like Daniel's idea to auto-define opEquals as a 
 field-for-field comparison. -- Andrei
That's basically what the compiler-generated opEquals does (though I think that it'll just do a memcmp if it knows that it can get away with that). So, if that's what you want, you're arguing for just having the compiler still define opEquals for you even if opCmp is defined.

And I'm fine with that, but if the concern is code breakage for AAs, and opCmp is not defined in a way that's consistent with opEquals, then that would break them. Now, I think that such types are buggy, so I'm not sure that that's all that big a deal, but if I understand correctly, basically anything that we do with 2.066 which doesn't involve continuing to use lhs.opCmp(rhs) == 0 for the AAs will break them, and we need to change it to use opEquals, so I'm not sure that we _can_ avoid code breakage unless the type defines opEquals, and it defines it in a manner which is consistent with opCmp (which it _should_ do but might not).

And if that's the case, then it just comes down to what type of code breakage we want to incur. And IMHO, having the default opEquals continue to be generated when opCmp is defined is a very workable solution, since no types should have defined opCmp in a way that was inconsistent with opEquals.

- Jonathan M Davis
Jul 24 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jul 24, 2014 at 08:36:02PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, 24 July 2014 at 19:35:12 UTC, Andrei Alexandrescu wrote:
I like Daniel's idea to auto-define opEquals as a field-for-field
comparison. -- Andrei
Isn't that what the compiler already does?
 That's basically what the compiler-generated opEquals does (though I
 think that it'll just do a memcmp if it knows that it can get away
 with that). So, if that's what you want, you're arguing for just having
 the compiler still define opEquals for you even if opCmp is defined.
 And I'm fine with that, but if the concern is code breakage for AAs,
 and opCmp is not defined in a way that's consistent with opEquals,
 then that would break them. Now, I think that such types are buggy, so
 I'm not sure that that's all that big a deal, but if I understand
 correctly, basically anything that we do with 2.066 which doesn't
 involve continuing to use lhs.opCmp(rhs) == 0 for the AAs will break
 them, and we need to change it to use opEquals, so I'm not sure that
 we _can_ avoid code breakage unless the type defines opEquals, and it
 defines it in a manner which is consistent with opCmp (which it
 _should_ do but might not). And if that's the case, then it just comes
 down to what type of code breakage we want to incur. And IMHO, having
 the default opEquals continue to be generated when opCmp is defined is
 a very workable solution, since no types should have defined opCmp in
 a way that was inconsistent with opEquals.
[...] Keep in mind that the last sentence means that wrong code (i.e. inconsistent opCmp/opEquals) will silently compile and misbehave at runtime. T -- Answer: Because it breaks the logical sequence of discussion. Question: Why is top posting bad?
Jul 24 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, 24 July 2014 at 21:54:01 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Thu, Jul 24, 2014 at 08:36:02PM +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 And IMHO, having the default opEquals continue to be generated 
 when
 opCmp is defined is a very workable solution, since no types 
 should
 have defined opCmp in
 a way that was inconsistent with opEquals.
[...] Keep in mind that the last sentence means that wrong code (i.e. inconsistent opCmp/opEquals) will silently compile and misbehave at runtime.
Well, that's exactly the behavior in 2.065 from what I can tell. If you don't define opEquals, the compiler defines it for you even if you defined opCmp. And trying out git head, it does exactly the same thing. You only get a compilation error when using AAs, which is pretty weird IMHO, and it seems very wrong to suddenly require that opEquals be defined when the default is still being generated. The only way that anyone is going to have problems is if their opCmp is not consistent with opEquals, which is just plain a bug IMHO, so if switching the AAs to opEquals instead of opCmp causes bugs, you're just swapping one set of bugs for another, which seems fine to me. Certainly, I think that it's stupid to require that opEquals be defined just because you're using the type as an AA key when it's not required otherwise. - Jonathan M Davis
Jul 24 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 https://issues.dlang.org/show_bug.cgi?id=13179
Consider also: http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9

The current scheme breaks existing code - code that was formerly correct and working.

AAs don't make sense if the notion of == on members is invalid. AAs formerly required opCmp, and yes, opCmp could be constructed to give different results for opCmp==0 than ==, but I would expect such an object to not be used in an AA, again because it doesn't make sense.

Using the default generated opEquals for AAs may break code, such as an AA of the structs in the parent post, but it seems unlikely that that code was correct anyway in an AA (as it would give erratic results).

Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is probably the best counter-argument, and I'd go with it if it didn't result in code breakage.
Jul 24 2014
next sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 25 July 2014 14:50, Walter Bright via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:

 https://issues.dlang.org/show_bug.cgi?id=13179
Consider also: http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9 The current scheme breaks existing code - code that was formerly correct and working. AAs don't make sense if the notion of == on members is invalid. AAs formerly required opCmp, and yes, opCmp could be constructed to give different results for opCmp==0 than ==, but I would expect such an object to not be used in an AA, again because it doesn't make sense. Using the default generated opEquals for AAs may break code, such as an AA of the structs in the parent post, but it seems unlikely that that code was correct anyway in an AA (as it would give erratic results). Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is probably the best counter-argument, and I'd go with it if it didn't result in code breakage.
I don't really see how opCmp == 0 could be unreliable or unintended. It was deliberately written by the author, so definitely not unintended, and I can't imagine anybody would ever deliberately ignore the == 0 case when implementing an opCmp, or produce logic that works for less or greater, but fails for equal. <= and >= are expressed by opCmp, which imply that testing for equality definitely works as the user intended. In lieu of an opEquals, how can a deliberately implemented opCmp, which we know works in the == case (otherwise <= or >= wouldn't work either) ever be a worse choice than an implicitly generated opEquals? Personally, just skimming through this thread, I find it baffling that this is controversial.
Jul 24 2014
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 05:52:57 UTC, Manu via Digitalmars-d 
wrote:
 On 25 July 2014 14:50, Walter Bright via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:

 On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:

 https://issues.dlang.org/show_bug.cgi?id=13179
Consider also: http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9 The current scheme breaks existing code - code that was formerly correct and working. AAs don't make sense if the notion of == on members is invalid. AAs formerly required opCmp, and yes, opCmp could be constructed to give different results for opCmp==0 than ==, but I would expect such an object to not be used in an AA, again because it doesn't make sense. Using the default generated opEquals for AAs may break code, such as an AA of the structs in the parent post, but it seems unlikely that that code was correct anyway in an AA (as it would give erratic results). Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is probably the best counter-argument, and I'd go with it if it didn't result in code breakage.
I don't really see how opCmp == 0 could be unreliable or unintended. It was deliberately written by the author, so definitely not unintended, and I can't imagine anybody would ever deliberately ignore the == 0 case when implementing an opCmp, or produce logic that works for less or greater, but fails for equal. <= and >= are expressed by opCmp, which imply that testing for equality definitely works as the user intended. In lieu of an opEquals, how can a deliberately implemented opCmp, which we know works in the == case (otherwise <= or >= wouldn't work either) ever be a worse choice than an implicitly generated opEquals? Personally, just skimming through this thread, I find it baffling that this is controversial.
So, in the case where opCmp was defined but not opEquals, instead of using the normal, built-in opEquals (which should already be equivalent to lhs.opCmp(rhs) == 0), we're going to make the compiler generate opEquals as lhs.opCmp(rhs) == 0? That's a silent performance hit for no good reason IMHO. It doesn't even improve correctness except in the cases where the programmer should have been defining opEquals in the first place, because lhs.opCmp(rhs) == 0 wasn't equivalent to the compiler-generated opEquals. So, we'd be making good code worse just to try and fix an existing bug in bad code in order to do what? Not break the already broken code? I can understand wanting to avoid breaking code when changing from using opCmp to using opEquals with AAs, but it's only an issue if the code was already broken by defining opCmp in a way that didn't match opEquals, so if I find it odd that any part of this is controversial, it's the fact that anyone thinks that we should try and avoid breaking code where opEquals and opCmp weren't equivalent. - Jonathan M Davis
Jul 24 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 08:44, Jonathan M Davis wrote:

 So, in the case where opCmp was defined but not opEquals, instead of
 using the normal, built-in opEquals (which should already be equivalent
 to lhs.opCmp(rhs) == 0), we're going to make the compiler generate
 opEquals as lhs.opCmp(rhs) == 0? That's a silent performance hit for no
 good reason IMHO.
So are default initialized variables, virtual by default and other similar cases. D aims for correctness and safety first. With the option to get better performance by, possibly, writing some extra code.
 It doesn't even improve correctness except in the
 cases where the programmer should have been defining opEquals in the
 first place, because lhs.opCmp(rhs) == 0 wasn't equivalent to the
 compiler-generated opEquals. So, we'd be making good code worse just to
 try and fix an existing bug in bad code in order to do what? Not break
 the already broken code?
I don't understand this. How are we making good code worse? If the code was working previously, opCmp == 0 should have had the same result as the default generated opEquals. In that case it's perfectly safe to define opEquals to be opCmp == 0.
 I can understand wanting to avoid breaking code when changing from using
 opCmp to using opEquals with AAs, but it's only an issue if the code was
 already broken by defining opCmp in a way that didn't match opEquals, so
 if I find it odd that any part of this is controversial, it's the fact
 that anyone thinks that we should try and avoid breaking code where
 opEquals and opCmp weren't equivalent.
By defining opEquals to be opCmp == 0 we're:

1. Not breaking code where it wasn't broken previously
2. Fixing broken code, that is, code where opEquals and opCmp == 0 gave different results

-- /Jacob Carlborg
Jul 25 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 08:21:26 UTC, Jacob Carlborg wrote:
 By defining opEquals to be opCmp == 0 we're:

 1. We're not breaking code where it wasn't broken previously
 2. We're fixing broken code. That is when opEquals and opCmp == 
 0 gave different results
Code that worked perfectly fine before is now slower, because it's using opCmp for opEquals when it wasn't before. Even worse, if you define opEquals, you're then forced to define toHash, which is much harder to get right. So, in order to avoid a performance hit on opEquals from defining opCmp, you now have to define toHash, which significantly increases the chances of bugs. And regardless of the increased risk of bugs, it's extra code that you shouldn't need to write anyway, because the normal, default opEquals and toHash worked just fine.

I honestly have no sympathy for anyone who defined opCmp to be different from the default opEquals but didn't define opEquals. Getting that right is simple, and it's trivial to test for if you're unit testing like you should be. I don't want to pay in my code just to make the compiler friendlier to someone who didn't even bother to do something so simple. And any code in that situation has always been broken anyway.

I'm _definitely_ not interested in reducing the performance of existing code in order to fix bugs in the code of folks who couldn't get opEquals or opCmp right. I'd much rather be able to take advantage of the fast, default opEquals and correct toHash than be forced to define them just because I defined opCmp and didn't want a performance hit on opEquals.

- Jonathan M Davis
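For reference, a sketch of the boilerplate under discussion: a type whose opCmp, opEquals and toHash all have to agree before it behaves sensibly as an AA key. The struct Key, its fields, and the choice to hash only x are made up for this illustration and are not taken from any post in the thread.

```d
struct Key
{
    int x; // takes part in ordering, equality and hashing
    int y; // ignored by all three

    int opCmp(const Key rhs) const
    {
        return (x > rhs.x) - (x < rhs.x);
    }

    // Must agree with opCmp: equal exactly when opCmp returns 0.
    bool opEquals(const Key rhs) const
    {
        return x == rhs.x;
    }

    // Must agree with opEquals: equal keys need equal hashes,
    // so y is deliberately left out.
    size_t toHash() const nothrow @safe
    {
        return cast(size_t) x;
    }
}

void main()
{
    int[Key] counts;
    counts[Key(1, 2)] = 10;
    counts[Key(1, 3)] += 1; // same key as far as the AA is concerned

    assert(counts.length == 1);
    assert(counts[Key(1, 0)] == 11);
}
```

If toHash mixed in y while opEquals ignored it, the two Key values above could land in different buckets, which is the kind of subtle bug the post warns about.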
Jul 25 2014
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 11:46, Jonathan M Davis wrote:

 Code that worked perfectly fine before is now slower, because it's using
 opCmp for opEquals when it wasn't before.
Who says opCmp need to be slower than opEquals.
 Even worse, if you define
 opEquals, you're then forced to define toHash, which is much harder to
 get right.
That might be a problem. But you can always call the one in TypeInfo.
 So, in order to avoid a performance hit on opEquals from
 defining opCmp
Assuming there is a performance hit. -- /Jacob Carlborg
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 4:18 AM, Jacob Carlborg wrote:
 On 25/07/14 11:46, Jonathan M Davis wrote:

 Code that worked perfectly fine before is now slower, because it's using
 opCmp for opEquals when it wasn't before.
Who says opCmp need to be slower than opEquals.
Consider:

    struct S {
        int a, b;
        int opCmp(S s2) { return (a == s2.a) ? s2.b - b : s2.a - a; }
        bool opEquals(S s2) { return *cast(long*)&this == *cast(long*)&s2; }
    }

Because of byte ordering variations, the cast trick wouldn't work reliably for opCmp. Do people do such things? Yes, since opEquals can very likely be in critical performance loops.
Jul 25 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 21:01:49 UTC, Walter Bright wrote:
 On 7/25/2014 4:18 AM, Jacob Carlborg wrote:
 On 25/07/14 11:46, Jonathan M Davis wrote:

 Code that worked perfectly fine before is now slower, because 
 it's using
 opCmp for opEquals when it wasn't before.
Who says opCmp need to be slower than opEquals.
Consider:

    struct S {
        int a, b;
        int opCmp(S s2) { return (a == s2.a) ? s2.b - b : s2.a - a; }
        bool opEquals(S s2) { return *cast(long*)&this == *cast(long*)&s2; }
    }

Because of byte ordering variations, the cast trick wouldn't work reliably for opCmp. Do people do such things? Yes, since opEquals can very likely be in critical performance loops.
Doesn't the compiler-generated opEquals do a memcmp when it can? Obviously, it can't always, but in the simple POD cases (that don't involve floating point values anyway), it should be able to do it. - Jonathan M Davis
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 2:55 PM, Jonathan M Davis wrote:
 Doesn't the compiler-generated opEquals do a memcmp when it can?
Yes.
Jul 25 2014
parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 25 July 2014 at 22:29:30 UTC, Walter Bright wrote:
 On 7/25/2014 2:55 PM, Jonathan M Davis wrote:
 Doesn't the compiler-generated opEquals do a memcmp when it 
 can?
Yes.
And even sometimes when it cannot :D (floats for instance).
Aug 16 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 09:46:55AM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Friday, 25 July 2014 at 08:21:26 UTC, Jacob Carlborg wrote:
By defining opEquals to be opCmp == 0 we're:

1. We're not breaking code where it wasn't broken previously
2. We're fixing broken code. That is when opEquals and opCmp == 0 gave
different results
Code that worked perfectly fine before is now slower, because it's using opCmp for opEquals when it wasn't before.
I don't understand why you keep bringing up the point of being slower. I thought the whole point of D was to be safe first, then performant if you ask for it. In this case, sure there will be a (small!) performance hit, but then the solution is just to define opEquals yourself -- which you should have been doing in the first place! So this is really just prodding the programmer in the right direction.
 Even worse, if you define opEquals, you're then forced to define
 toHash, which is much harder to get right.
If you're redefining opCmp and opEquals, I seriously question whether the default toHash actually produces the correct result. If it did, it begs the question, what's the point of redefining opCmp?
 So, in order to avoid a performance hit on opEquals from defining
 opCmp, you now have to define toHash, which significantly increases
 the chances of bugs. And regardless of the increased risk of bugs,
 it's extra code that you shouldn't need to write anyway, because the
 normal, default opEquals and toHash worked just fine.
 
 I honestly have no sympathy for anyone who defined opCmp to be
 different from the default opEquals but didn't define opEquals.
 Getting that right is simple, and it's trivial to test for if you're unit
 testing like you should be.
Frankly, I find this rather incongruous. First you say that requiring programmers to define toHash themselves is too high an expectation, then you say that you have no sympathy on these same programmers 'cos they can't get their opEquals code right. If it's too much to expect them to write toHash properly, why would we expect them to write opEquals correctly either? But if they *are* expected to get opEquals right, then why is it a problem for them to also get toHash right? I'm honestly baffled at what your point is.
 I don't want to pay in my code just to make the compiler friendlier to
 someone who didn't even bother to do something so simple.
[...] And you don't have to. You just define opEquals correctly as you have always done, and you pay *nothing*. The only time you pay is when you forgot to define opEquals -- in which case, which is worse, bad performance, or incorrect code? Perhaps you have different priorities, but I'd rather have bad performance than incorrect code, especially *subtly* wrong code that's very difficult to track down. [...]
 I'd much rather be able to take advantage of the fast, default
 opEquals and correct toHash than be forced to define them just because
 I defined opCmp and didn't want a performance hit on opEquals.
[...] So perhaps we should implement `bool opEquals = default;`. T -- My program has no bugs! Only unintentional features...
Jul 25 2014
next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"H. S. Teoh via Digitalmars-d"  wrote in message 
news:mailman.336.1406295294.32463.digitalmars-d puremagic.com...

 So perhaps we should implement `bool opEquals = default;`.
No new syntax.
Jul 25 2014
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 11:42:59PM +1000, Daniel Murphy via Digitalmars-d wrote:
 "H. S. Teoh via Digitalmars-d"  wrote in message
 news:mailman.336.1406295294.32463.digitalmars-d puremagic.com...
 
So perhaps we should implement `bool opEquals = default;`.
No new syntax.
*shrug* That's just what Jonathan suggested earlier. At worst, you could just write:

	bool opEquals(T t) {
		return typeid(this).equals(&this, &t);
	}

It's a little more typing, but surely it's not *that* hard??


T

-- 
"Hi." "'Lo."
Jul 25 2014
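A minimal sketch of the "write opEquals yourself" route, spelling out memberwise equality with `tupleof` instead of going through TypeInfo. The struct `Point` and its fields are hypothetical and chosen only for illustration; the point is that an explicit opEquals can coexist with an opCmp that agrees with it.

```d
struct Point
{
    int x;
    int y;

    // Total order: compare x first, then y.
    int opCmp(const Point rhs) const
    {
        if (x != rhs.x) return x < rhs.x ? -1 : 1;
        if (y != rhs.y) return y < rhs.y ? -1 : 1;
        return 0;
    }

    // Explicit memberwise equality: compares every field pairwise.
    bool opEquals(const Point rhs) const
    {
        return this.tupleof == rhs.tupleof;
    }
}

void main()
{
    auto a = Point(1, 2);
    auto b = Point(1, 2);
    auto c = Point(1, 3);

    assert(a == b);
    assert(a != c);
    assert(a < c && c > a);
    // Equality and ordering agree on when two values are "the same".
    assert((a.opCmp(b) == 0) == (a == b));
    assert((a.opCmp(c) == 0) == (a == c));
}
```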
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 13:34:55 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 09:46:55AM +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 Even worse, if you define opEquals, you're then forced to 
 define
 toHash, which is much harder to get right.
If you're redefining opCmp and opEquals, I seriously question whether the default toHash actually produces the correct result. If it did, it begs the question, what's the point of redefining opCmp?
Except that with the current git master, you're forced to define opEquals just because you define opCmp, which would then force you to define opCmp. And with your suggested fix of making opEquals equivalent to lhs.opCmp(rhs) == 0, then _every_ type with opCmp will have to define toHash, because the default toHash is for the default opEquals, not for a user-defined opCmp. And remember that a lot of types have opCmp just to work with AAs, so all of a sudden, _every_ user-defined type which is used as an AA key will have to define toHash.
 Frankly, I find this rather incongruous. First you say that 
 requiring
 programmers to define toHash themselves is too high an 
 expectation, then
 you say that you have no sympathy on these same programmers 
 'cos they
 can't get their opEquals code right. If it's too much to expect 
 them to
 write toHash properly, why would we expect them to write 
 opEquals
 correctly either? But if they *are* expected to get opEquals 
 right, then
 why is it a problem for them to also get toHash right? I'm 
 honestly
 baffled at what your point is.
opEquals is trivial. toHash is much harder to get right, especially if you want a hash function that's even halfway decent. - Jonathan M Davis
Jul 25 2014
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 25.07.2014 20:45, schrieb Jonathan M Davis:
 On Friday, 25 July 2014 at 13:34:55 UTC, H. S. Teoh via Digitalmars-d
 wrote:
 On Fri, Jul 25, 2014 at 09:46:55AM +0000, Jonathan M Davis via
 Digitalmars-d wrote:
 Even worse, if you define opEquals, you're then forced to define
 toHash, which is much harder to get right.
If you're redefining opCmp and opEquals, I seriously question whether the default toHash actually produces the correct result. If it did, it begs the question, what's the point of redefining opCmp?
Except that with the current git master, you're forced to define opEquals just because you define opCmp, which would then force you to define opCmp.
That sentence doesn't make much sense, did you mean "opHash just because you define opEquals" or something similar?
 opEquals is trivial. toHash is much harder to get right, especially if
 you want a hash function that's even halfway decent.
As written before somewhere else, a toHash that is as decent as the current default, but limited to the fields you actually want, should be really easy, if phobos exposed a function like hash_t createHash(T...)(T args) that does to args what D currently does to all members to create the default hash. (I kinda tend towards "whatever, maybe not defaulting opEquals to a.opCmp(b) == 0 is acceptable, people should stumble upon this when first reading the documentation on how to overload operators" now, though) Cheers, Daniel
Jul 25 2014
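A rough sketch of what a helper like the `createHash` proposed above might look like. Phobos does not provide this function; the name comes from the post, and the body (chaining druntime's `hashOf` with the previous result as seed) is only an assumption about one reasonable way to build it. The `Key` struct is hypothetical.

```d
// Hypothetical helper: hashes exactly the arguments given, in order.
size_t createHash(T...)(T args)
{
    size_t h = 0;
    foreach (a; args)       // unrolled at compile time, one step per argument
        h = hashOf(a, h);   // the previous result is fed in as the seed
    return h;
}

struct Key
{
    int id;
    string name;
    int cachedLength; // derived data, deliberately ignored below

    bool opEquals(const Key rhs) const
    {
        return id == rhs.id && name == rhs.name;
    }

    // Hash only the fields that opEquals looks at.
    size_t toHash() const nothrow @safe
    {
        return createHash(id, name);
    }
}

void main()
{
    int[Key] table;
    table[Key(1, "a", 0)] = 42;

    // Same id and name, different cached field: still the same key.
    assert(Key(1, "a", 99) in table);
    assert(Key(2, "a", 0) !in table);
}
```

The design choice here is that toHash and opEquals read the same subset of fields, so equal keys always land in the same bucket.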
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 18:54:07 UTC, Daniel Gibson wrote:
 Am 25.07.2014 20:45, schrieb Jonathan M Davis:
 On Friday, 25 July 2014 at 13:34:55 UTC, H. S. Teoh via 
 Digitalmars-d
 wrote:
 On Fri, Jul 25, 2014 at 09:46:55AM +0000, Jonathan M Davis via
 Digitalmars-d wrote:
 Even worse, if you define opEquals, you're then forced to 
 define
 toHash, which is much harder to get right.
If you're redefining opCmp and opEquals, I seriously question whether the default toHash actually produces the correct result. If it did, it begs the question, what's the point of redefining opCmp?
Except that with the current git master, you're forced to define opEquals just because you define opCmp, which would then force you to define opCmp.
That sentence doesn't make much sense, did you mean "opHash just because you define opEquals" or something similar?
You're right. I meant that you would then be forced to define toHash. - Jonathan M Davis
Jul 25 2014
prev sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 25 July 2014 at 18:45:30 UTC, Jonathan M Davis wrote:
 On Friday, 25 July 2014 at 13:34:55 UTC, H. S. Teoh via 
 Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 09:46:55AM +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 Even worse, if you define opEquals, you're then forced to 
 define
 toHash, which is much harder to get right.
If you're redefining opCmp and opEquals, I seriously question whether the default toHash actually produces the correct result. If it did, it begs the question, what's the point of redefining opCmp?
Except that with the current git master, you're forced to define opEquals just because you define opCmp, which would then force you to define opCmp. And with your suggested fix of
(assuming you mean "toHash")
 making opEquals equivalent to lhs.opCmp(rhs) == 0, then _every_ 
 type with opCmp will have to define toHash, because the default 
 toHash is for the default opEquals, not for a user-defined 
 opCmp.
No, only those types that define opCmp _and_ are going to be used as AA keys, and that's sensible. All others don't need toHash.
 And remember that a lot of types have opCmp just to work with 
 AAs, so all of a sudden, _every_ user-defined type which is 
 used as an AA key will have to define toHash.
No, if a type had only defined opCmp because of the previous AA (mis)implementation, it needs to be changed with any of the suggested solutions: If opEquals is not going to be auto-generated, the user needs to add it, if it is, the user has the choice between adding toHash, or (more likely, as opCmp usually isn't necessary) changing opCmp into opEquals.
Jul 25 2014
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 19:03:16 UTC, Marc Schütz wrote:
 On Friday, 25 July 2014 at 18:45:30 UTC, Jonathan M Davis wrote:
 And remember that a lot of types have opCmp just to work with 
 AAs, so all of a sudden, _every_ user-defined type which is 
 used as an AA key will have to define toHash.
No, if a type had only defined opCmp because of the previous AA (mis)implementation, it needs to be changed with any of the suggested solutions: If opEquals is not going to be auto-generated, the user needs to add it, if it is, the user has the choice between adding toHash, or (more likely, as opCmp usually isn't necessary) changing opCmp into opEquals.
No, if the opCmp is consistent with the default-generated opEquals, then there's no need to define either opEquals or opCmp. And if opCmp was inconsistent with the default opEquals, then toHash would have already had to have been defined to be consistent with opCmp, or the AA wouldn't have worked properly regardless. So, if we let opEquals and toHash continue to be generated by the compiler when opCmp is defined, the only folks who would now have to define opEquals or toHash who didn't before would be the folks who should have been defining them previously to be consistent with opCmp but didn't.

Whereas with the current git master, they'd all have to define opEquals and toHash. With H.S. Teoh's suggestion, opEquals wouldn't have to be defined but toHash would (since the compiler has no way of knowing that lhs.opCmp(rhs) == 0 is equivalent to the default opEquals that the default toHash is consistent with).

The current option breaks all AA-related code that didn't define opEquals and toHash already (which a lot of code didn't have to do), and H.S. Teoh's option breaks all AA-related code that didn't define toHash already. So, the option that causes the least code breakage is to let the compiler continue to define opEquals and toHash as it has, even if opCmp has been defined. The only risk is if opCmp wasn't consistent with the default opEquals, but if that's the case, the code was already broken anyway.

- Jonathan M Davis
Jul 25 2014
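The consistency requirement discussed above can be checked mechanically. The following is a sketch only: `consistent` is a hypothetical helper, not part of druntime or Phobos, and it uses `hashOf` as a stand-in for whatever hash the AA would use. `S` is the struct from the start of the thread, whose opCmp looks at `x` only.

```d
// From the original post: ordering ignores y, default equality does not.
struct S
{
    int x;
    int y;
    int opCmp(S s) { return x - s.x; }
}

// Hypothetical check: opCmp == 0 must agree with ==, and equal values
// must hash alike.
bool consistent(T)(T a, T b)
{
    immutable equal = (a == b);
    if ((a.opCmp(b) == 0) != equal)
        return false;
    if (equal && hashOf(a) != hashOf(b))
        return false;
    return true;
}

void main()
{
    assert(consistent(S(1, 2), S(1, 2)));  // identical values
    assert(consistent(S(1, 2), S(2, 1)));  // unequal and ordered
    assert(!consistent(S(1, 2), S(1, 3))); // opCmp says 0, == says false
}
```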
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 12:03 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
 No, if a type had only defined opCmp because of the previous AA
 (mis)implementation,
It was not a misimplementation. The previous implementation used a hash lookup with a binary tree for collisions, hence it needed cmp. It was perfectly correct. The newer one uses a linear list for collisions, hence it only needs ==.
Jul 25 2014
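To illustrate why a collision list needs nothing beyond a hash and `==`, here is a toy chained hash set. It is a sketch for exposition and not the druntime AA implementation; `TinySet`, `Id`, and the fixed bucket count are all made up for this example.

```d
// Toy hash set: each bucket is a plain array searched linearly.
struct TinySet(K)
{
    K[][8] buckets;

    void insert(K key)
    {
        auto bucket = &buckets[hashOf(key) % buckets.length];
        foreach (ref k; *bucket)
            if (k == key) return;   // already present; only == is needed
        *bucket ~= key;
    }

    bool contains(K key)
    {
        foreach (ref k; buckets[hashOf(key) % buckets.length])
            if (k == key) return true;
        return false;
    }
}

// A key type with no opCmp at all.
struct Id
{
    int value;
}

void main()
{
    TinySet!Id set;
    set.insert(Id(3));
    set.insert(Id(11));
    set.insert(Id(3)); // duplicate, ignored

    assert(set.contains(Id(3)));
    assert(set.contains(Id(11)));
    assert(!set.contains(Id(4)));
}
```

A tree-based bucket would instead have to decide "left or right" at each node, which is where an ordering such as opCmp would come in.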
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 02:05:13PM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/25/2014 12:03 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
No, if a type had only defined opCmp because of the previous AA
(mis)implementation,
It was not a misimplementation. The previous implementation used a hash lookup with a binary tree for collisions, hence it needed cmp. It was perfectly correct. The newer one uses a linear list for collisions, hence it only needs ==.
So it sounds like Jonathan's solution is the best after all: get rid of the error message that demands that opEquals be defined when opCmp is defined. Previous code that defined opCmp for AA keys will then continue to work (except if their opCmp was inconsistent with the default opEquals, in which case they are already buggy and we're not making things worse).

On a related note, it would be nice if AA key types that did define opCmp can have better than linear collision resolution. But that belongs in another discussion.

On another related note, the compiler's treatment of opCmp is inconsistent:

	struct S {
		int opCmp(S s) { return 0; }
	}
	int[S] aa1; // OK

	struct T {
		int opCmp(T s) const { return 0; }
	}
	int[T] aa2; // Compile error: must also define opEquals

Worse yet:

	struct S {
		int x;
		int opCmp(S s) /*const*/ { return s.x - x; }
	}
	void main() {
		auto s1 = S(1);
		auto s2 = S(2);
		assert(s1 > s2);
		assert(s2 < s1);
		assert(typeid(s1).compare(&s1, &s2) > 0);
		assert(typeid(s2).compare(&s2, &s1) < 0);
	}

This produces a runtime error:

	object.Error (0): TypeInfo.compare is not implemented

Uncommenting the const in opCmp fixes the problem. WAT?


T

-- 
Без труда не выловишь и рыбку из пруда.
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 2:44 PM, H. S. Teoh via Digitalmars-d wrote:
 Uncommenting the const in opCmp fixes the problem.  WAT?
opCmp must be const in order to be recognized for TypeInfo.
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 3:31 PM, Walter Bright wrote:
 On 7/25/2014 2:44 PM, H. S. Teoh via Digitalmars-d wrote:
 Uncommenting the const in opCmp fixes the problem.  WAT?
opCmp must be const in order to be recognized for TypeInfo.
See https://dlang.org/operatoroverloading#compare
Jul 25 2014
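A short sketch of the const form being recognized, based on the behaviour reported in this exchange (the non-const version fails at run time, the const version works). The struct is the one from the post above with the comparison written in the usual direction; nothing here is taken from the linked documentation page.

```d
struct S
{
    int x;

    // const, so the comparison is also usable through TypeInfo.
    int opCmp(const S rhs) const
    {
        return (x > rhs.x) - (x < rhs.x); // -1, 0 or 1 without overflow
    }
}

void main()
{
    auto s1 = S(1);
    auto s2 = S(2);

    // Direct use through the operators.
    assert(s1 < s2);
    assert(s2 > s1);

    // Indirect use through TypeInfo, as the runtime does.
    assert(typeid(S).compare(&s1, &s2) < 0);
    assert(typeid(S).compare(&s2, &s1) > 0);
}
```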
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 03:33:24PM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/25/2014 3:31 PM, Walter Bright wrote:
On 7/25/2014 2:44 PM, H. S. Teoh via Digitalmars-d wrote:
Uncommenting the const in opCmp fixes the problem.  WAT?
opCmp must be const in order to be recognized for TypeInfo.
See https://dlang.org/operatoroverloading#compare
That page doesn't say anything about TypeInfo, though. But even then, why doesn't the compiler reject opCmp signatures that don't match the compiler's expectations? T -- It's amazing how careful choice of punctuation can leave you hanging:
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 4:28 PM, H. S. Teoh via Digitalmars-d wrote:
 That page doesn't say anything about TypeInfo, though.
It says to follow the form.
 But even then,
 why doesn't the compiler reject opCmp signatures that don't match the
 compiler's expectations?
Because you may want to use an opCmp for other purposes.
Jul 25 2014
parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 00:31, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 7/25/2014 4:28 PM, H. S. Teoh via Digitalmars-d wrote:
 That page doesn't say anything about TypeInfo, though.
It says to follow the form.
 But even then,
 why doesn't the compiler reject opCmp signatures that don't match the
 compiler's expectations?
Because you may want to use an opCmp for other purposes.
I think this is bad practice. I did write a terse API for a library I'm building up, where:

	return a < b;

Would return codegen describing the operation of (a < b). I had second thoughts about it though, mostly because it was perhaps a bit *too* magical. :)

Iain
Jul 28 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/24/2014 10:52 PM, Manu via Digitalmars-d wrote:
 I don't really see how opCmp == 0 could be unreliable or unintended. It was
 deliberately written by the author, so definitely not unintended, and I can't
 imagine anybody would ever deliberately ignore the == 0 case when implementing
 an opCmp, or produce logic that works for less or greater, but fails for equal.
 <= and >= are expressed by opCmp, which imply that testing for equality
 definitely works as the user intended.
Yes, that's why it's hard to see that it would break existing code, unless that existing code had a bug in it that was worked around in some peculiar way.
 In lieu of an opEquals, how can a deliberately implemented opCmp, which we know
 works in the == case (otherwise <= or >= wouldn't work either) ever be a worse
 choice than an implicitly generated opEquals?
Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.
Jul 24 2014
next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Walter Bright"  wrote in message news:lqsunn$2ke5$1 digitalmars.com...

 Determining an ordering can sometimes be more expensive. It is, after all, 
 asking for more information.
It could also be significantly cheaper, if only a subset of fields need to be compared.
Jul 25 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 07:21:11 UTC, Daniel Murphy wrote:
 "Walter Bright"  wrote in message 
 news:lqsunn$2ke5$1 digitalmars.com...

 Determining an ordering can sometimes be more expensive. It 
 is, after all, asking for more information.
It could also be significantly cheaper, if only a subset of fields need to be compared.
If that's the case, then the default opEquals isn't correct, and the programmer should have already defined opEquals. If they didn't, then their code is broken, and I see no reason to penalize the folks who wrote correct code just to fix someone else's broken code by then defining opEquals in terms of opCmp. - Jonathan M Davis
Jul 25 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Jonathan M Davis"  wrote in message 
news:zcutsbuilcttvbuahmlc forum.dlang.org...

 If that's the case, then the default opEquals isn't correct, and the 
 programmer should have already defined opEquals. If they didn't, then 
 their code is broken, and I see no reason to penalize the folks who wrote 
 correct code just to fix someone else's broken code by then defining 
 opEquals in terms of opCmp.
Just because not all fields _need_ to be compared doesn't mean the default opEquals was incorrect. The ignored fields could be cached values calculated from the other fields.
Jul 25 2014
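The cached-field case can be sketched as follows. `Vec` and its members are hypothetical; the assumption is that the cache is always derived from the other fields, so the default memberwise opEquals and an opCmp that skips the cache give the same answer about equality.

```d
struct Vec
{
    int x;
    int y;
    long lengthSquared; // cache, always computed from x and y

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
        lengthSquared = cast(long) x * x + cast(long) y * y;
    }

    // Orders by x, then y; the cache is never inspected.
    int opCmp(const Vec rhs) const
    {
        if (x != rhs.x) return x < rhs.x ? -1 : 1;
        if (y != rhs.y) return y < rhs.y ? -1 : 1;
        return 0;
    }
}

void main()
{
    auto a = Vec(3, 4);
    auto b = Vec(3, 4);
    auto c = Vec(3, 5);

    // Default opEquals compares all three fields, opCmp only two,
    // yet they agree because the third is a function of the first two.
    assert(a == b && a.opCmp(b) == 0);
    assert(a != c && a.opCmp(c) != 0);
    assert(a < c);
}
```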
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 07:34:29 UTC, Daniel Murphy wrote:
 "Jonathan M Davis"  wrote in message 
 news:zcutsbuilcttvbuahmlc forum.dlang.org...

 If that's the case, then the default opEquals isn't correct, 
 and the programmer should have already defined opEquals. If 
 they didn't, then their code is broken, and I see no reason to 
 penalize the folks who wrote correct code just to fix someone 
 else's broken code by then defining opEquals in terms of opCmp.
Just because not all fields _need_ to be compared doesn't mean the default opEquals was incorrect. The ignored fields could be cached values calculated from the other fields.
True. I didn't think of that. But even if that's the case, if they don't define opEquals, then they've always been getting an opEquals which compares all of them. The only place that they would have gotten better performance would have been when the type was used as the key in an AA, since that will now use opEquals instead of opCmp. But if they want to get that efficiency gain, then they can just define opEquals themselves - and if they really cared about that gain, they would have already defined opEquals themselves anyway, because the cases other than AAs would have been using the default opEquals. So, while you have a good point that opCmp _can_ be more efficient than opEquals, it usually isn't, and the places where that would make a difference should already be defining opEquals anyway, meaning that changing the default opEquals to use opCmp wouldn't gain them anything unless they didn't care about that efficiency gain. - Jonathan M Davis
Jul 25 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 08:50, Walter Bright wrote:

 Yes, that's why it's hard to see that it would break existing code,
 unless that existing code had a bug in it that was worked around in some
 peculiar way.
If the type was only used as an AA key and never checked for equivalence, then it worked correctly when opCmp was used for AA keys. Also, adding an opEquals that calls opCmp == 0 will make it work for equivalence as well, even though it's never used. And it will fix the breaking change with AA keys. -- /Jacob Carlborg
Jul 25 2014
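What "adding an opEquals that calls opCmp == 0" might look like for the struct from the start of the thread, as a sketch. The toHash shown is an assumption added so that the type stays usable as an AA key: it hashes only the field that opCmp inspects, using druntime's `hashOf`.

```d
struct S
{
    int x;
    int y;

    // Ordering looks at x only, as in the original post.
    int opCmp(const S rhs) const
    {
        return (x > rhs.x) - (x < rhs.x);
    }

    // Equality defined in terms of the ordering.
    bool opEquals(const S rhs) const
    {
        return opCmp(rhs) == 0;
    }

    // Must agree with opEquals, so y is left out here too.
    size_t toHash() const nothrow @safe
    {
        return hashOf(x);
    }
}

void main()
{
    auto s1 = S(1, 2);
    auto s2 = S(1, 3);

    assert(s1 <= s2 && s2 >= s1);
    assert(s1 == s2); // now consistent with the line above

    int[S] aa;
    aa[s1] = 10;
    assert(s2 in aa); // the AA agrees with == as well
}
```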
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 1:27 AM, Jacob Carlborg wrote:
 On 25/07/14 08:50, Walter Bright wrote:

 Yes, that's why it's hard to see that it would break existing code,
 unless that existing code had a bug in it that was worked around in some
 peculiar way.
If the type was only used as an AA key and never checked for equivalence, then it worked correctly when opCmp was used for AA keys. Also, adding an opEquals that calls opCmp == 0 will make it work for equivalence as well, even though it's never used. And it will fix the breaking change with AA keys.
The thing is, either this suffers from == behaving differently than AAs, or you've made opEquals superfluous by defining it to be opCmp==0. The latter is a mistake, as Andrei has pointed out, as opCmp may not have a concept of equality, and opEquals may not have a concept of ordering. I.e. it's not just about AAs.
Jul 25 2014
prev sibling parent Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 25 July 2014 16:50, Walter Bright via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 7/24/2014 10:52 PM, Manu via Digitalmars-d wrote:

 I don't really see how opCmp == 0 could be unreliable or unintended. It
 was
 deliberately written by the author, so definitely not unintended, and I
 can't
 imagine anybody would ever deliberately ignore the == 0 case when
 implementing
 an opCmp, or produce logic that works for less or greater, but fails for
 equal.
 <= and >= are expressed by opCmp, which imply that testing for equality
 definitely works as the user intended.
Yes, that's why it's hard to see that it would break existing code, unless that existing code had a bug in it that was worked around in some peculiar way.
Indeed.
 In lieu of an opEquals, how can a deliberately implemented opCmp, which we
 know
 works in the == case (otherwise <= or >= wouldn't work either) ever be a
 worse
 choice than an implicitly generated opEquals?
Determining an ordering can sometimes be more expensive. It is, after all, asking for more information.
Correctness has always been the first criterion to satisfy in D. The user is always able to produce faster code with deliberate effort, and that's true in this case too, but you can't have something with a high probability of being incorrect be the default...?

When there is no opEquals and an opCmp exists, the probability of being correct is super-biased towards the user-supplied opCmp==0, which must already support <=/>= and is therefore almost certainly correct, rather than some compiler-generated guess, which has no insight into the object and can only be correct if you're lucky...

I'm a user who's concerned with performance more than most, but there's no way I can buy into that argument in this case. It's just wrong, and the sort of bugs that this is likely to produce are highly surprising, very easily overlooked, and likely to result in many lost hours to track down. It's the sort of bug that nobody wants to be tracking down.

All that said, I'm not even convinced that there would be a performance advantage anyway. I'd be surprised if the optimiser wouldn't produce correct code for 'a-b==0' vs 'a==b'. These are trivial things that optimisers have been extremely good at for decades. If I had to guess at which one offered a performance advantage, I'd say that they'd likely be the same (because optimisers work well with that sort of input), or the advantage would go to the user opCmp. The reason I say that is that a user-supplied opCmp may compare *at most* every field (and therefore likely perform the same), but in reality, there's a good chance that the comparison requires comparing only a subset of fields - a user struct is likely to contain some irrelevant fields, cache data perhaps, whatever - and therefore comparing less stuff would more likely yield a performance advantage.
Jul 25 2014
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 04:50:38 UTC, Walter Bright wrote:
 On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 https://issues.dlang.org/show_bug.cgi?id=13179
Consider also: http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9 The current scheme breaks existing code - code that was formerly correct and working. AAs don't make sense if the notion of == on members is invalid. AAs formerly required opCmp, and yes, opCmp could be constructed to give different results for opCmp==0 than ==, but I would expect such an object to not be used in an AA, again because it doesn't make sense. Using the default generated opEquals for AAs may break code, such as an AA of the structs in the parent post, but it seems unlikely that that code was correct anyway in an AA (as it would give erratic results).
Exactly. The only reason that switching from using lhs.opCmp(rhs) == 0 to opEquals would break code is if a type does not define them such that they're equivalent, which would mean that opEquals and/or opCmp was defined in a buggy manner. So, the only way that the change would break code is if it was broken in the first place. All it risks is making it so that the bug exhibits itself in an additional case.
 Kenji's rebuttal 
 https://issues.dlang.org/show_bug.cgi?id=13179#c2 is probably 
 the best counter-argument, and I'd go with it if it didn't 
 result in code breakage.
Yeah. It wouldn't be all that bad to do something similar to C++11 and make it so that we explicitly indicate when we want to use the default opEquals (or toHash), but doing so would break code, and outright just making it so that opEquals must be defined when AAs are used without allowing a way for the compiler-generated opEquals or toHash to be used seems very broken to me. With such a change, _no_ existing user-defined type would be able to use the built-in opEquals or toHash functions if they're used with AAs, which is particularly bad in the case of toHash, since that's much harder to get right. - Jonathan M Davis
Jul 24 2014
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 08:31, Jonathan M Davis wrote:

 Exactly. The only reason that switching from using lhs.opCmp(rhs) == 0
 to opEquals would break code is if a type does not define them such that
 they're equivalent, which would mean that opEquals and/or opCmp was
 defined in a buggy manner. So, the only way that the change would break
 code is if it was broken in the first place. All it risks is making it
 so that the bug exhibits itself in an additional case.
If the type is only used as an AA key and never checked for equivalence, it worked when opCmp was used for AA keys. -- /Jacob Carlborg
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 1:28 AM, Jacob Carlborg wrote:
 If the type is only used as an AA key and never checked for equivalence, it
 worked when opCmp was used for AA keys.
Then we'll just get another bug report from AAs behaving differently from ==.
Jul 25 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 10:55, Walter Bright wrote:

 Then we'll just get another bug report from AAs behaving differently
 from ==.
No, not as long as it's not used. -- /Jacob Carlborg
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 2:04 AM, Jacob Carlborg wrote:
 On 25/07/14 10:55, Walter Bright wrote:

 Then we'll just get another bug report from AAs behaving differently
 from ==.
No, not as long as it's not used.
Well, that's a forlorn hope :-)
Jul 25 2014
prev sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 7/25/14, 3:31 AM, Jonathan M Davis wrote:
 On Friday, 25 July 2014 at 04:50:38 UTC, Walter Bright wrote:
 On 7/23/2014 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 https://issues.dlang.org/show_bug.cgi?id=13179
Consider also: http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9 The current scheme breaks existing code - code that was formerly correct and working. AAs don't make sense if the notion of == on members is invalid. AAs formerly required opCmp, and yes, opCmp could be constructed to give different results for opCmp==0 than ==, but I would expect such an object to not be used in an AA, again because it doesn't make sense. Using the default generated opEquals for AAs may break code, such as an AA of the structs in the parent post, but it seems unlikely that that code was correct anyway in an AA (as it would give erratic results).
Exactly. The only reason that switching from using lhs.opCmp(rhs) == 0 to opEquals would break code is if a type does not define them such that they're equivalent, which would mean that opEquals and/or opCmp was defined in a buggy manner. So, the only way that the change would break code is if it was broken in the first place. All it risks is making it so that the bug exhibits itself in an additional case.
Not at all. If you have a type that has partial ordering (only cares about opCmp, not about opEquals), but still keeps the default opEquals, then this would silently break someone's code by changing their opEquals semantic. THIS is the breaking change.
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 5:10 AM, Ary Borenszweig wrote:
 Not at all.

 If you have a type that has partial ordering (only cares about opCmp, not about
 opEquals), but still keeps the default opEquals, then this would silently break
 someone's code by changing their opEquals semantic.

 THIS is the breaking change.
Yes. A subtle but extremely important point. Comparison and Equality are fundamentally different operations. Defining opEquals to be the equivalent of opCmp==0 is utterly breaking that.
Jul 25 2014
next sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 06:35, Walter Bright via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 7/25/2014 5:10 AM, Ary Borenszweig wrote:

 Not at all.

 If you have a type that has partial ordering (only cares about opCmp, not
 about
 opEquals), but still keeps the default opEquals, then this would silently
 break
 someone's code by changing their opEquals semantic.

 THIS is the breaking change.
Yes. A subtle but extremely important point. Comparison and Equality are fundamentally different operations. Defining opEquals to be the equivalent of opCmp==0 is utterly breaking that.
Perhaps the problem here is that there is a missing concept. There is equality and equivalence, and only equivalence is expressed in D (ie, the one that is relevant among the suite of comparison operations). Would you argue that == and != are unrelated, distinct and separate operations from <,<=,>=,>, and they should never be used in conjunction, or assumed to be related? I think any reasonable person will assume that the suite of comparisons are related operations. So perhaps === is missing, and that's what should be used for AA's, and also the thing that actually matches the compiler's generated opEquals...?
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 8:47 PM, Manu via Digitalmars-d wrote:
 Would you argue that == and != are unrelated, distinct and separate operations
 from <,<=,>=,>, and they should never be used in conjunction, or assumed to be
 related?
 I think any reasonable person will assume that the suite of comparisons are
 related operations.
Any reasonable person would assume that floating point (a+b)+c == a+(b+c) but it does not work that way. Andrei gave a specific example in this thread. There is a reason that both opEquals and opCmp are defined in the language, and that opEquals is for == and !=, and opCmp is for < <= > >=. Conflating them together is a mistake.
Jul 25 2014
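The floating-point remark can be made concrete with a few lines. The values below are picked for illustration (they are not from the thread) so that the smaller addend is lost when it is added to the large magnitude first.

```d
void main()
{
    double a = 1e30;
    double b = -1e30;
    double c = 1.0;

    double left = (a + b) + c;  // a and b cancel exactly, then c is added
    double right = a + (b + c); // c is absorbed by b before a cancels it

    assert(left == 1.0);
    assert(right == 0.0);
    assert(left != right);      // addition is not associative here
}
```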
prev sibling next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Sat, Jul 26, 2014 at 01:47:20PM +1000, Manu via Digitalmars-d wrote:
 On 26 July 2014 06:35, Walter Bright via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:
[...]
 So perhaps === is missing, and that's what should be used for AA's,
 and also the thing that actually matches the compiler's generated
 opEquals...?
No! Please don't. '===' is one of the worst WATs of modern day language design, and only leads to confusion and pain. And endless bugs from people not understanding (and not bothering nor wanting to understand) what the difference is. One only has to look at the insanity surrounding === in Javascript and PHP for ample reasons why this is a bad idea. Please don't dainbramage D by introducing this nastiness. T -- Старый друг лучше новых двух.
Jul 25 2014
prev sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 16:13, H. S. Teoh via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On Sat, Jul 26, 2014 at 01:47:20PM +1000, Manu via Digitalmars-d wrote:
 On 26 July 2014 06:35, Walter Bright via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:
[...]
 So perhaps === is missing, and that's what should be used for AA's,
 and also the thing that actually matches the compiler's generated
 opEquals...?
No! Please don't. '===' is one of the worst WATs of modern day language design, and only leads to confusion and pain. And endless bugs from people not understanding (and not bothering nor wanting to understand) what the difference is. One only has to look at the insanity surrounding === in Javascript and PHP for ample reasons why this is a bad idea. Please don't dainbramage D by introducing this nastiness.
It's okay, I hate it too. But I equally can't abide == meaning something different than <, <=, etc. That's insane. Like I said, I'm just absolutely astonished that people think the situation in your OP is okay, especially when the solution is so obvious.
Jul 25 2014
parent reply "Ola Fosheim Grøstad" writes:
On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via Digitalmars-d 
wrote:
 It's okay, I hate it too.
 But I equally can't abide == meaning something different than 
 <, <=, etc.
 That's insane.
Yes, it is unsound to use opCmp for types that aren't totally ordered: http://en.wikipedia.org/wiki/Partially_ordered_set «In general two elements x and y of a partial order may stand in any of four mutually exclusive relationships to each other: either x < y, or x = y, or x > y, or x and y are incomparable (none of the other three). A totally ordered set is one that rules out this fourth possibility: all pairs of elements are comparable and we then say that trichotomy holds. The natural numbers, the integers, the rationals, and the reals are all totally ordered by their algebraic (signed) magnitude whereas the complex numbers are not.»
 Like I said, I'm just absolutely astonished that people think 
 the situation
 in your OP is okay, especially when the solution is so obvious.
Right, when you only have 3 states, you should require total order.
Jul 26 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Saturday, 26 July 2014 at 07:42:05 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via 
 Digitalmars-d wrote:
 It's okay, I hate it too.
 But I equally can't abide == meaning something different than 
 <, <=, etc.
 That's insane.
Yes, it is unsound to use opCmp for types that aren't totally ordered:
Yes, that's why it's possible to provide opEquals in addition to opCmp. But for the vast majority of cases, opEquals _is_ equivalent to opCmp == 0, and element-wise equality is not. Defining opEquals to be the latter by default _even in the presence of opCmp_ is therefore wrong in almost all cases.
Jul 26 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 On Saturday, 26 July 2014 at 07:42:05 UTC, Ola Fosheim Grøstad wrote:
 On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via Digitalmars-d wrote:
 It's okay, I hate it too.
 But I equally can't abide == meaning something different than <, <=,
 etc.
 That's insane.
Yes, it is unsound to use opCmp for types that aren't totally ordered:
Yes, that's why it's possible to provide opEquals in addition to opCmp. But for the vast majority of cases, opEquals _is_ equivalent to opCmp == 0, and element-wise equality is not. Defining opEquals to be the latter by default _even in the presence of opCmp_ is therefore wrong in almost all cases.
Case-insensitive ordering is a simple example. Field for field comparison is the right default whether or not opCmp is defined. Andrei
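To make the case-insensitive example concrete, here is a small sketch (not from the original post). The struct `Name` and its field are hypothetical; `std.uni.icmp` is the Phobos case-insensitive comparison. It assumes the behaviour defended here, namely that == falls back to field-for-field comparison when only opCmp is defined.

```d
import std.uni : icmp;

/// Hypothetical type: ordered ignoring case, no user-defined opEquals.
struct Name
{
    string text;

    int opCmp(const Name rhs) const
    {
        return icmp(text, rhs.text); // case-insensitive ordering
    }
}

void main()
{
    auto a = Name("Hello World");
    auto b = Name("hello WORLD");

    // Equivalent under the ordering: neither sorts before the other.
    assert(!(a < b) && !(b < a));
    assert(a <= b && a >= b);

    // The compiler-generated == compares the fields, so they are not equal.
    assert(a != b);
}
```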
Jul 26 2014
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Saturday, 26 July 2014 at 09:48:55 UTC, Andrei Alexandrescu 
wrote:
 On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 Yes, that's why it's possible to provide opEquals in addition 
 to opCmp.
Not quite, opCmp would then have to throw if opCmp(a,b) is incomparable. Conflating incomparable and equal values as 0 is a bad idea when sorting. That means incomparable values are sprinkled randomly over the sort.
 Case-insensitive ordering is a simple example.
That doesn't sound right. "<=" is defined for all possible pairs. Case insensitive ordering satisfies the totality requirement: (a <= b or b <= a), for all strings a and b?
Jul 26 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Saturday, 26 July 2014 at 10:43:08 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 26 July 2014 at 09:48:55 UTC, Andrei Alexandrescu 
 wrote:
 On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 Yes, that's why it's possible to provide opEquals in addition 
 to opCmp.
Not quite, opCmp would then have to throw if opCmp(a,b) is incomparable. Conflating incomparable and equal values as 0 is a bad idea when sorting. That means incomparable values are sprinkled randomly over the sort.
Ok, I see what you mean, and I agree. If you can have incomparable elements, you cannot sort reliably. But you were responding to Manu:
 But I equally can't abide == meaning something different than 
 <, <=, etc.
 That's insane.
I somehow took your response as disagreement with him.
Jul 26 2014
parent "Ola Fosheim Grøstad" writes:
On Saturday, 26 July 2014 at 12:19:41 UTC, Marc Schütz wrote:
 I somehow took your response as disagreement with him.
No, I think it is useful to have a default total sort order for types even if the order isn't natural and might appear arbitrary. It is useful in binary trees etc. That does not mean that you should also define comparison and equality, just because you have a default total order for sorting. The constraints put on comparison operators are too restrictive for domain specific types (such as fuzzy numbers, intervals etc)
Jul 26 2014
prev sibling next sibling parent "Marc Schütz" <schuetzm gmx.net> writes:
On Saturday, 26 July 2014 at 09:48:55 UTC, Andrei Alexandrescu 
wrote:
 On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:
 On Saturday, 26 July 2014 at 07:42:05 UTC, Ola Fosheim Grøstad 
 wrote:
 On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via 
 Digitalmars-d wrote:
 It's okay, I hate it too.
 But I equally can't abide == meaning something different 
 than <, <=,
 etc.
 That's insane.
Yes, it is unsound to use opCmp for types that aren't totally ordered:
Yes, that's why it's possible to provide opEquals in addition to opCmp. But for the vast majority of cases, opEquals _is_ equivalent to opCmp == 0, and element-wise equality is not. Defining opEquals to be the latter by default _even in the presence of opCmp_ is therefore wrong in almost all cases.
Case-insensitive ordering is a simple example. Field for field comparison is the right default whether or not opCmp is defined.
If you're starting with the premise that the user's intention was to keep the default semantics of opEquals, then yes. I just find this unlikely. To take your example, I for one would fully expect "hello WORLD" and "Hello World" to be considered equal in all circumstances, not just when sorting. If you want it to be restricted to sorting, it should probably not be a property of the type, but a predicate should be used instead for sorting. If you still want it to be different, just define an opEquals, it's trivial using `tupleof`. Indeed, both options make assumptions about the user's intentions, but IMO the assumption for opEquals not using opCmp is the more unlikely one. Maybe we can then at least require the user to specify explicitly what she wants, as H.S. Teoh (?) suggested? Maybe for now implement whatever has been the default before to minimize breakage, but deprecate it with an appropriate message, and remove it a few releases down the road?
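As a sketch of what "trivial using `tupleof`" could look like (my illustration, with a hypothetical `Person` type): opCmp orders by name only, and the explicit opEquals spells out field-for-field equality so the intent is visible in the source rather than left to the default.

```d
import std.uni : icmp;

/// Hypothetical type: sorted by name ignoring case, equal only if all fields match.
struct Person
{
    string name;
    int id;

    int opCmp(const Person rhs) const
    {
        return icmp(name, rhs.name);
    }

    // Explicit field-for-field equality; .tupleof expands to all fields.
    bool opEquals(const Person rhs) const
    {
        return this.tupleof == rhs.tupleof;
    }
}

void main()
{
    auto a = Person("Ada", 1);
    auto b = Person("ada", 2);

    assert(!(a < b) && !(b < a)); // same position in the ordering
    assert(a != b);               // but not equal
    assert(a == Person("Ada", 1));
}
```

If such a type is also used as an associative array key, a matching toHash would presumably be needed as well; that is left out of this sketch.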
Jul 26 2014
prev sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 19:48, Andrei Alexandrescu via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net>" wrote:

 On Saturday, 26 July 2014 at 07:42:05 UTC, Ola Fosheim Grøstad wrote:
 On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via Digitalmars-d wrote:
 It's okay, I hate it too.
 But I equally can't abide == meaning something different than <, <=,
 etc.
 That's insane.
Yes, it is unsound to use opCmp for types that aren't totally ordered:
Yes, that's why it's possible to provide opEquals in addition to opCmp. But for the vast majority of cases, opEquals _is_ equivalent to opCmp ==
 0, and element-wise equality is not. Defining opEquals to be the latter
 by default _even in the presence of opCmp_ is therefore wrong in almost
 all cases.
Case-insensitive ordering is a simple example. Field for field comparison is the right default whether or not opCmp is defined.
...you're trolling me right?

Just to be clear, you're saying you think it's reasonable for <, <=, >=, > to perform case insensitive comparison for ordering purposes, but for ==, != to be case sensitive for equality comparisons? You're saying you think that's what the user *probably* wants, by default?

Is there precedent for something like this? I've never seen anything like it. It creates very awkward relationships between the suite of operators which is likely to break down in many logical constructs.

I don't understand; your example is the perfect example of why opCmp==0 should be the default opEquals, but somehow it's an argument against? I have no idea how to reason about this topic.

I come from a place where <,<=,==,!=,>=,> are a suite, and it is reasonable to assume they all work the same. Is that not the default presumption of modern programmers? Is it really so unlikely that people would make the mistake of assuming they are related?
Jul 26 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/26/14, 7:58 PM, Manu via Digitalmars-d wrote:
 On 26 July 2014 19:48, Andrei Alexandrescu via Digitalmars-d
 <digitalmars-d puremagic.com <mailto:digitalmars-d puremagic.com>> wrote:

     On 7/26/14, 2:19 AM, "Marc Schütz" <schuetzm gmx.net
     <mailto:schuetzm gmx.net>>" wrote:

         On Saturday, 26 July 2014 at 07:42:05 UTC, Ola Fosheim Grøstad
         wrote:

             On Saturday, 26 July 2014 at 06:50:11 UTC, Manu via
             Digitalmars-d wrote:

                 It's okay, I hate it too.
                 But I equally can't abide == meaning something different
                 than <, <=,
                 etc.
                 That's insane.


             Yes, it is unsound to use opCmp for types that aren't
             totally ordered:


         Yes, that's why it's possible to provide opEquals in addition to
         opCmp.
         But for the vast majority of cases, opEquals _is_ equivalent to
         opCmp ==
         0, and element-wise equality is not. Defining opEquals to be the
         latter
         by default _even in the presence of opCmp_ is therefore wrong in
         almost
         all cases.


     Case-insensitive ordering is a simple example. Field for field
     comparison is the right default whether or not opCmp is defined.


 ...you're trolling me right?
No.
 Just to be clear, you're saying you think it's reasonable for <, <=, >=,
  > to perform case insensitive comparison for ordering purposes, but for
 ==, != to be case sensitive for equality comparisons?
Odder examples have been shown here in support of various other points. I certainly think it's possible, and it's not a bug by definition.
 You're saying you think that's what the user *probably* wants, by
 default?
Yes.
 Is there precedent for something like this?
In all likelihood.
  I've never seen
 anything like it.
Time to expand one's social circle :o).
 It creates very awkward relationships between the suite of operators
 which is likely to break down in many logical constructs.
Doesn't seem that drastic to me.
 I don't understand; your example is the perfect example of why opCmp==0
 should be the default opEquals, but somehow it's an argument against? I
 have no idea how to reason about this topic..
You yourself seemed to reach for an operator ===. In fact those comparisons you think should exist already exist: what you claim "==" should be is really !(a < b) && !(b < a) or !opCmp(a, b); and what you think "===" should be is really "==".
 I come from a place where <,<=,==,!=,>=,> are a suite, and it is
 reasonable to assume they all work the same.
I think that place is not a good place. That's not a reasonable assumption.
 Is that not the default
 presumption of modern programmers?
No.
 Is it really so unlikely that people
 would make the mistake of assuming they are related?
Yes. Andrei
Jul 26 2014
parent Tobias Müller <troplin bluewin.ch> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote: 
 It creates very awkward relationships between the suite of operators
 which is likely to break down in many logical constructs.
Doesn't seem that drastic to me.
 I don't understand; your example is the perfect example of why opCmp==0
 should be the default opEquals, but somehow it's an argument against? I
 have no idea how to reason about this topic..
You yourself seemed to reach for an operator ===. In fact those comparisons you think should exist already exist: what you claim "==" should be is really !(a < b) && !(b < a) or !opCmp(a, b); and what you think "===" should be is really "==".
 I come from a place where <,<=,==,!=,>=,> are a suite, and it is
 reasonable to assume they all work the same.
I think that place is not a good place. That's not a reasonable assumption.
 Is that not the default
 presumption of modern programmers?
No.
 Is it really so unlikely that people
 would make the mistake of assuming they are related?
Yes.
I also strongly disagree with that. opCmp and opEquals should be separate because only one of them may make sense for a type. But if both make sense they should agree. Of course there exist many possible sorting orders for a type, but the standard comparison operators should always be the natural ones. For other orders you have predicate based sorting. As I understand it, that's also Walter's argument why breaking code is ok in that case, because if they don't agree, the code was already buggy in the first place. At the very least, *I* would be surprised by inconsistent comparison operators. And if I wasn't, I still wouldn't implement it like that because I'm sure that someone at some time will be confused by it.

Tobi
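A short sketch of the predicate-based alternative mentioned above (illustrative only; the helper name `sortedIgnoringCase` is hypothetical). The element type keeps its natural == and <, and the alternative order is passed to `sort` as a predicate.

```d
import std.algorithm : sort;
import std.uni : icmp;

/// Hypothetical helper: returns a copy of `words` sorted ignoring case.
string[] sortedIgnoringCase(string[] words)
{
    auto result = words.dup;
    // The order lives in the predicate, not in the type's operators.
    result.sort!((a, b) => icmp(a, b) < 0);
    return result;
}

void main()
{
    auto words = ["pear", "Apple", "banana", "Cherry"];

    assert(sortedIgnoringCase(words) == ["Apple", "banana", "Cherry", "pear"]);

    // The natural string order compares code units, so uppercase sorts first.
    sort(words);
    assert(words == ["Apple", "Cherry", "banana", "pear"]);
}
```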
Jul 27 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 06:50, Walter Bright wrote:

 Consider also:

 http://www.reddit.com/r/programming/comments/2bl51j/programming_in_d_a_great_online_book_for_learning/cj75gm9


 The current scheme breaks existing code - code that was formerly correct
 and working.

 AAs don't make sense if the notion of == on members is invalid. AAs
 formerly required opCmp, and yes, opCmp could be constructed to give
 different results for opCmp==0 than ==, but I would expect such an
 object to not be used in an AA, again because it doesn't make sense.

 Using the default generated opEquals for AAs may break code, such as
 an AA of the structs in the parent post, but it seems unlikely that that
 code was correct anyway in an AA (as it would give erratic results).
The code [1] from the original issue, 13179, does have an opCmp which handles equivalence.
 Kenji's rebuttal https://issues.dlang.org/show_bug.cgi?id=13179#c2 is
 probably the best counter-argument, and I'd go with it if it didn't
 result in code breakage.
I still don't see the problem:

1. If neither opCmp nor opEquals is defined, the compiler will automatically generate these, and they will be used for comparison and equivalence
2. If opEquals is defined, lhs == rhs will be lowered to lhs.opEquals(rhs)
3. If opCmp is defined but no opEquals, lhs == rhs will be lowered to lhs.opCmp(rhs) == 0
4. If opCmp and opEquals are both defined, lhs == rhs will be lowered to lhs.opEquals(rhs)

The only case where this will break code is when opCmp was defined and opEquals was not, and lhs.opCmp(rhs) == 0 and lhs == rhs gave different results. Many here think this is a bug in the first place. If anything, this change would fix broken code.

[1] https://github.com/SiegeLord/Tango-D2/blob/d2port/tango/text/Regex.d#L2345

-- 
/Jacob Carlborg
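Rule 3 above is a proposal, not what the compiler does. A type can opt in to the same behaviour by hand, though. The sketch below reuses the struct from the opening post; the overflow-safe form of the comparison is my substitution for the original `x - s.x`.

```d
struct S
{
    int x;
    int y;

    int opCmp(const S rhs) const
    {
        // Compare only x; avoids the overflow that x - rhs.x can hit.
        return (x > rhs.x) - (x < rhs.x);
    }

    // Hand-written equivalent of the proposed lowering: == means opCmp == 0.
    bool opEquals(const S rhs) const
    {
        return opCmp(rhs) == 0;
    }
}

void main()
{
    auto s1 = S(1, 2);
    auto s2 = S(1, 3);
    auto s3 = S(2, 1);

    assert(s1 < s3 && s3 > s2);
    assert(s1 <= s2 && s2 >= s1);
    assert(s1 == s2); // now consistent with <= and >=
}
```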
Jul 25 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 1:02 AM, Jacob Carlborg wrote:
 3. If opCmp is defined but no opEquals, lhs == rhs will be lowered to
 lhs.opCmp(rhs) == 0
This is the sticking point. opCmp and opEquals are separate on purpose, see Andrei's posts.
Jul 25 2014
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 25 Jul 2014 09:39:11 +0100, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 7/25/2014 1:02 AM, Jacob Carlborg wrote:
 3. If opCmp is defined but no opEquals, lhs == rhs will be lowered to
 lhs.opCmp(rhs) == 0
This is the sticking point. opCmp and opEquals are separate on purpose, see Andrei's posts.
Sure, Andrei makes a valid point .. for a minority of cases. The majority case will be that opEquals and opCmp==0 will agree. In those minority cases where they are intended to disagree the user will have intentionally defined both, to be different. I cannot think of any case where a user will intend for these to be different, then not define both to ensure it. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 25 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 12:10:31PM +0100, Regan Heath via Digitalmars-d wrote:
 On Fri, 25 Jul 2014 09:39:11 +0100, Walter Bright
 <newshound2 digitalmars.com> wrote:
 
On 7/25/2014 1:02 AM, Jacob Carlborg wrote:
3. If opCmp is defined but no opEquals, lhs == rhs will be lowered
to lhs.opCmp(rhs) == 0
This is the sticking point. opCmp and opEquals are separate on purpose, see Andrei's posts.
Sure, Andrei makes a valid point .. for a minority of cases. The majority case will be that opEquals and opCmp==0 will agree. In those minority cases where they are intended to disagree the user will have intentionally defined both, to be different. I cannot think of any case where a user will intend for these to be different, then not define both to ensure it.
[...] Exactly!! I don't understand why people keep bringing up non-linear partial orderings -- those only apply in a *minority* of cases! (Raise your hands if your code depends on non-linear partial orderings. How many of you *require* this more often than linear orderings? Yeah, I thought so.) Why are we sacrificing the *common* case -- where opCmp defines a linear ordering -- for the minority case? And it's not like we're making it impossible in the minority case -- if you want a non-linear partial ordering, wouldn't you make sure to define both opCmp and opEquals so that they do the right thing? Since it's an uncommon use case, people would tend to be more careful when implementing it. I argue that it's in the *common* case of linear orderings, where people are more liable to assume (incorrectly, it seems!) that opEquals should default to opCmp()==0 -- and that's the case we should be addressing. Let's not lose sight of the forest for the minority of the trees.


T

-- 
The peace of mind---from knowing that viruses which exploit Microsoft system vulnerabilities cannot touch Linux---is priceless. -- Frustrated system administrator.
Jul 25 2014
next sibling parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Friday, 25 July 2014 at 13:44:54 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 12:10:31PM +0100, Regan Heath via 
 Digitalmars-d wrote:
 On Fri, 25 Jul 2014 09:39:11 +0100, Walter Bright
 <newshound2 digitalmars.com> wrote:
 
On 7/25/2014 1:02 AM, Jacob Carlborg wrote:
3. If opCmp is defined but no opEquals, lhs == rhs will be 
lowered
to lhs.opCmp(rhs) == 0
This is the sticking point. opCmp and opEquals are separate on purpose, see Andrei's posts.
Sure, Andrei makes a valid point .. for a minority of cases. The majority case will be that opEquals and opCmp==0 will agree. In those minority cases where they are intended to disagree the user will have intentionally defined both, to be different. I cannot think of any case where a user will intend for these to be different, then not define both to ensure it.
[...] Exactly!! I don't understand why people keep bringing up non-linear partial orderings -- those only apply in a *minority* of cases! (Raise your hands if your code depends on non-linear partial orderings. How many of you *require* this more often than linear orderings? Yeah, I thought so.) Why are we sacrificing the *common* case -- where opCmp defines a linear ordering -- for the minority case? And it's not like we're making it impossible in the minority case -- if you want a non-linear partial ordering, wouldn't you make sure to define both opCmp and opEquals so that they do the right thing? Since it's an uncommon use case, people would tend to be more careful when implementing it.
Am I missing something, or wouldn't a non-linear ordering imply that x.opCmp(y) != 0 for all x,y ∈ T, and thus automatically generating opEquals as opCmp() == 0 would automatically do the right thing in this case? So the number of people that require a different opEquals is even smaller, and defining opEquals and opCmp for two different orderings is a code smell squared.
Jul 25 2014
parent "Tobias Pankrath" <tobias pankrath.net> writes:
 And it's not like we're making it impossible in the minority 
 case -- if
 you want a non-linear partial ordering, wouldn't you make sure 
 to define
 both opCmp and opEquals so that they do the right thing? Since 
 it's an
 uncommon use case, people would tend to be more careful when
 implementing it.
Am I missing something, or wouldn't a non-linear ordering imply that x.opCmp(y) != 0 for all x,y ∈ T, and thus automatically generating opEquals as opCmp() == 0 would automatically do the right thing in this case? So the number of people that require a different opEquals is even smaller, and defining opEquals and opCmp for two different orderings is a code smell squared.
Ah, never mind, got my hands on a coffee now.
Jul 25 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Friday, 25 July 2014 at 13:44:54 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 Exactly!! I don't understand why people keep bringing up 
 non-linear
 partial orderings -- those only apply in a *minority* of cases!
Well, if <, <= etc are only to be used where you have a "natural" total order then I guess you are right, but then opCmp should be limited to those types. Since comparison and boolean operators (&& etc) cannot be defined in a general manner that will work with all branches of mathematics, maybe they should be limited to total orders. It is awfully limiting for DSL-style programming, though. As I've pointed out before, how D limits && and || prevents readable fuzzy logic and opCmp prevents useful fuzzy number operators. So, since the limits are there already, maybe forbidding orders on complex types, and vectors etc is sensible too… (Why limit some types and not others?)
 Since it's an
 uncommon use case, people would tend to be more careful when
 implementing it. I argue that it's in the *common* case of 
 linear orderings, where people are more liable to assume
It is quite common to want an order on domain-specific objects without discriminating all cases, unless you do direct comparison for equality. Say if you have a colour type you might want an order on chromaticity, but not on intensity. If you have a vector, you might want an order on magnitude, but not on direction. If you have a normal vector you might want an order on acute angles, e.g. define a<b === is_acute_angle(a,b). If opCmp automatically defines equality, then you have to remember to undefine it. Equality as opCmp would be slow and wrong in the case of "order by magnitude": (a.x*a.x + a.y*a.y) == (b.x*b.x + b.y*b.y) This can go undetected by the programmer if you use a mixin to add "standard operators" or if you don't care about equality in the actual type, but it is used for equality comparison in an aggregate (a struct that has the type as a field). A more sensible approach from a correctness viewpoint is to require equality to be defined explicitly if you provide opCmp. I mean, if you want to argue for CORRECTNESS, take it all the way. Not 50% convenience, 50% correctness, and a dash of flexibility, but not quite enough to cover the most pragmatic branches of mathematics…
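A sketch of the "order by magnitude" case (illustrative; `Vec2` and `mag2` are hypothetical names). With only opCmp defined, the compiler-generated == still compares the fields, so two different vectors of the same length are equivalent under the ordering without being equal.

```d
/// Hypothetical 2D vector ordered by magnitude only.
struct Vec2
{
    double x, y;

    /// Squared magnitude; enough for ordering, avoids a sqrt.
    double mag2() const
    {
        return x * x + y * y;
    }

    int opCmp(const Vec2 rhs) const
    {
        immutable a = mag2(), b = rhs.mag2();
        return (a > b) - (a < b);
    }
}

void main()
{
    auto a = Vec2(3, 4);
    auto b = Vec2(4, 3);
    auto c = Vec2(1, 1);

    assert(c < a);                // shorter vector sorts first
    assert(!(a < b) && !(b < a)); // same magnitude: equivalent in the ordering
    assert(a != b);               // field-wise ==: different vectors
}
```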
Jul 25 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 07:21:04PM +0000, via Digitalmars-d wrote:
 On Friday, 25 July 2014 at 13:44:54 UTC, H. S. Teoh via Digitalmars-d wrote:
Exactly!! I don't understand why people keep bringing up non-linear
partial orderings -- those only apply in a *minority* of cases!
Well, if <, <= etc are only to be used where you have a "natural" total order then I guess you are right, but then opCmp should be limited to those types.
No it doesn't have to be. If you want it to work with non-linear orders, just define your own opEquals. We're talking about the *default* behaviour here. Linear orders should be default because they are what people want (and expect) in the majority of cases. Nobody said anything about making it *impossible* to define non-linear orderings. T -- It only takes one twig to burn down a forest.
Jul 25 2014
parent "Ola Fosheim Grøstad" writes:
On Friday, 25 July 2014 at 20:13:57 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 Nobody said anything
 about making it *impossible* to define non-linear orderings.
But opCmp() already makes it impossible to define some binary relations that are order-like for <, <= etc. You cannot return both -1 and 1 at the same time. And >= is defined as the complement of <. What does 0 signify, does it signify equality or that two values cannot be ordered? Take for instance intervals: Defining [0,1] < [2,3] makes sense, but should [1,2] vs [0,3] give 0, or is that only for [1,2] vs [1,2]? It makes a lot of sense to provide "sortability traits" for the type, such as total-order etc, but then one shouldn't limit the implementation of operators like this without tying it to a trait of some kind. Either the presence of opCmp is a trait of the type, or the trait has to be provided by some other means. Related to some other comments in the thread, some sort algorithms designed for scenarios where you have many similar values need equality for proper or efficient sorting. Like quicksort implementations where you partition into 3 sets (left, pivot-equality, right). On most CPUs you get both less-than and equality information from a single compare so how you implement opCmp and sorting is crucial for performance…
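One possible way to express the interval case, offered as a sketch rather than a settled answer: it assumes a compiler that rewrites `a < b` to `a.opCmp(b) < 0` (and likewise for the other three operators) and accepts a floating-point return type from opCmp, so that NaN can stand for "incomparable". The `Interval` type is hypothetical.

```d
/// Hypothetical closed interval [lo, hi] with a partial order.
struct Interval
{
    int lo, hi;

    float opCmp(const Interval rhs) const
    {
        if (hi < rhs.lo) return -1; // entirely to the left
        if (lo > rhs.hi) return 1;  // entirely to the right
        if (lo == rhs.lo && hi == rhs.hi) return 0;
        return float.nan;           // overlapping: incomparable
    }
}

void main()
{
    assert(Interval(0, 1) < Interval(2, 3));
    assert(Interval(1, 2) <= Interval(1, 2));

    // Overlapping intervals: every ordering comparison against NaN is false.
    assert(!(Interval(1, 2) < Interval(0, 3)));
    assert(!(Interval(1, 2) >= Interval(0, 3)));
}
```

Such a type is not safe to hand to a sort that expects a strict weak ordering, which is the concern raised above about incomparable values.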
Jul 25 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 4:10 AM, Regan Heath wrote:
 Sure, Andrei makes a valid point .. for a minority of cases.  The majority case
 will be that opEquals and opCmp==0 will agree.  In those minority cases where
 they are intended to disagree the user will have intentionally defined both, to
 be different.  I cannot think of any case where a user will intend for these to
 be different, then not define both to ensure it.
You've agreed with my point, then, that autogenerating opEquals as memberwise equality (not opCmp==0) if one is not supplied will be correct unless the user code is already broken.
Jul 25 2014
next sibling parent Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 06:38, Walter Bright via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 7/25/2014 4:10 AM, Regan Heath wrote:

 Sure, Andrei makes a valid point .. for a minority of cases.  The
 majority case
 will be that opEquals and opCmp==0 will agree.  In those minority cases
 where
 they are intended to disagree the user will have intentionally defined
 both, to
 be different.  I cannot think of any case where a user will intend for
 these to
 be different, then not define both to ensure it.
You've agreed with my point, then, that autogenerating opEquals as memberwise equality (not opCmp==0) if one is not supplied will be correct unless the user code is already broken.
No, because there's no obvious reason to define opEquals if you do define opCmp, and the opEq
Jul 25 2014
prev sibling next sibling parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 26 July 2014 13:33, Manu <turkeyman gmail.com> wrote:

 On 26 July 2014 06:38, Walter Bright via Digitalmars-d <
 digitalmars-d puremagic.com> wrote:

 On 7/25/2014 4:10 AM, Regan Heath wrote:

 Sure, Andrei makes a valid point .. for a minority of cases.  The
 majority case
 will be that opEquals and opCmp==0 will agree.  In those minority cases
 where
 they are intended to disagree the user will have intentionally defined
 both, to
 be different.  I cannot think of any case where a user will intend for
 these to
 be different, then not define both to ensure it.
You've agreed with my point, then, that autogenerating opEquals as memberwise equality (not opCmp==0) if one is not supplied will be correct unless the user code is already broken.
No, because there's no obvious reason to define opEquals if you do define opCmp, and the opEq
Oops, sorry! Hit the send hotkey >_< No, because there's no obvious reason to define opEquals if you do define opCmp and the opEquals would be the same. It seems to me, at face value, that opCmp is for full range of comparisons, and opEquals is for unordered types. Surely this is a reasonable conclusion to make? I don't see how you can say that a compiler generated opEquals in the presence of a user opCmp can reliably be correct. It may be correct, if you're lucky, and that's the best offer you'll get. opCmp==0 however is practically certain to be correct, since <= and >= are required to work... and the api embodies the concept of equality, it would be very hard to write an implementation where equal was broken, but <,<=,>=,> all worked.
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 8:41 PM, Manu via Digitalmars-d wrote:
 No, because there's no obvious reason to define opEquals if you do define opCmp
 and the opEquals would be the same.
 It seems to me, at face value, that opCmp is for full range of comparisons, and
 opEquals is for unordered types. Surely this is a reasonable conclusion to
make?

 I don't see how you can say that a compiler generated opEquals in the presence
 of a user opCmp can reliably be correct.
You cannot say that opCmp can reliably be used for ==. Andrei provided a more specific example.
 It may be correct, if you're lucky, and that's the best offer you'll get.
 opCmp==0 however is practically certain to be correct, since <= and >= are
 required to work... and the api embodies the concept of equality, it would be
 very hard to write an implementation where equal was broken, but <,<=,>=,> all
 worked.
At this point, it's obvious we are going around in a circle. You ask the same questions over and over, and I answer them over and over. If you don't want to accept that equality and comparison are fundamentally different operations, I can only repeat saying the same things.
Jul 25 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 09:22:26PM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/25/2014 8:41 PM, Manu via Digitalmars-d wrote:
[...]
It may be correct, if you're lucky, and that's the best offer you'll
get.  opCmp==0 however is practically certain to be correct, since <=
and >= are required to work... and the api embodies the concept of
equality, it would be very hard to write an implementation where
equal was broken, but <,<=,>=,> all worked.
At this point, it's obvious we are going around in a circle. You ask the same questions over and over, and I answer them over and over. If you don't want to accept that equality and comparison are fundamentally different operations, I can only repeat saying the same things.
Well, we can argue about this until the cows come home, but at least for the present regression being addressed, I think Jonathan's fix is the best option (or the least of all evils): revert the compiler change that causes a compile error when the user defines opCmp but not opEquals.

In the meantime, I think much of the confusion comes from the current documentation not being adequately clear about the reasoning behind having opCmp and opEquals, so it's too easy to get the wrong impression that defining opCmp is enough to make things work, or to have a fuzzy, inaccurate understanding of how opCmp interacts with opEquals, and when/why to use each. I think a documentation PR is in order.

T

-- 
I don't trust computers, I've spent too long programming to think that they can get anything right. -- James Miller
Jul 25 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 11:05 PM, H. S. Teoh via Digitalmars-d wrote:
 Well, we can argue about this until the cows come home, but at least for
 the present regression being addressed, I think Jonathan's fix is the
 best option (or the least of all evils): revert the compiler change that
 causes a compile error when the user defines opCmp but not opEquals.
His fix is also what I proposed - we both came to the same conclusion.
 In the meantime, I think much of the confusion comes from the current
 documentation not be adequately clear about the reasoning behind having
 opCmp and opEquals, so it's too easy to get the wrong impression that
 defining opCmp is enough to make things work, or to have a fuzzy
 inaccurate understanding for how opCmp interacts with opEquals, and
 when/why to use each. I think a documentation PR is in order.
I welcome a PR from you on this!
Jul 25 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 11:27:57PM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/25/2014 11:05 PM, H. S. Teoh via Digitalmars-d wrote:
[...]
In the meantime, I think much of the confusion comes from the current
documentation not be adequately clear about the reasoning behind
having opCmp and opEquals, so it's too easy to get the wrong
impression that defining opCmp is enough to make things work, or to
have a fuzzy inaccurate understanding for how opCmp interacts with
opEquals, and when/why to use each. I think a documentation PR is in
order.
I welcome a PR from you on this!
https://github.com/D-Programming-Language/dlang.org/pull/620 T -- Give a man a fish, and he eats once. Teach a man to fish, and he will sit forever.
Jul 26 2014
next sibling parent reply "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 00:43:40 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 https://github.com/D-Programming-Language/dlang.org/pull/620
Thank you for this. There is still a problem, I think. Defining opEquals only makes sense if a user wants to replace equality by some equivalence relation (different from equality). The user is not forced to define opEquals such that it models an equivalence relation but it hardly makes sense. Similarly, the user is free to define opCmp without restriction. In practice, however, it does not seem to make any sense if <= does not even model a preorder (reflexive and transitive) or one of >, <=, < does not match. It turns out that intuition of many people around here is not at random. Let A be a set and <= a preorder on A. For all a, b in A define ~ such that a ~ b = (a <= b or b <= a). Then ~ is an equivalence relation. (Let me know if you need a proof.) Clearly, it is possible to define different equivalence relations on a set. The same is true for orderings. Now opEquals and opCmp are used to define a default equivalence relation and ordering on a type, respectively. Please excuse my lack of creativity: in presence of opCmp I cannot see a single sensible use case for defining a.opEquals(b) different from a.opCmp(b) == 0. Those examples mentioned before are skewed. If a candidate for opCmp does not match the default equivalence relation == (defined implicitly or explicitly specified using opEquals) it should not be defined at all.
Jul 27 2014
next sibling parent "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 19:04:09 UTC, Fool wrote:
 define ~ such that a ~ b = (a <= b or b <= a)
^^ and, of course What should I say, I'm a fool... ;-)
Jul 27 2014
prev sibling next sibling parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 27 July 2014 at 19:04:09 UTC, Fool wrote:
 If a candidate for opCmp does not match the default equivalence 
 relation == (defined implicitly or explicitly specified using 
 opEquals) it should not be defined at all.
Does this mean that you agree that opCmp should define a total order?
Jul 27 2014
parent reply "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 20:45:25 UTC, Ola Fosheim Grøstad 
wrote:
 On Sunday, 27 July 2014 at 19:04:09 UTC, Fool wrote:
 If a candidate for opCmp does not match the default 
 equivalence relation == (defined implicitly or explicitly 
 specified using opEquals) it should not be defined at all.
Does this mean that you agree that opCmp should define a total order?
I think that it should be documented to require properties of a strict partial order (irreflexivity and transitivity, and thus asymmetry) and recommended to model a strict weak order such that (stable) sorting is defined.
Jul 27 2014
parent reply "Fool" <fool dlang.org> writes:
I think that some confusion is due to == being regarded as meaning 
equality while it only means equivalence.

Modifying the RGB color example: there is a natural order < on 
the red channel. Given this ordering, two RGB colors are 
equivalent if and only if their red values coincide. In this 
sense, for example, the colors (0, 0, 0) and (0, 255, 255) are 
equivalent although they are not equal.
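
A minimal sketch of the red-channel example, assuming a hypothetical `RGB` struct that defines opCmp but no opEquals, so that == falls back to the compiler's member-wise comparison:

```d
struct RGB
{
    ubyte r, g, b;

    // Order colors by the red channel only (illustrative choice).
    int opCmp(const RGB rhs) const
    {
        return cast(int) r - cast(int) rhs.r;
    }
    // No opEquals: == uses the default member-wise comparison.
}

void main()
{
    auto black = RGB(0, 0, 0);
    auto cyan  = RGB(0, 255, 255);

    assert(black.opCmp(cyan) == 0);         // equivalent under the red-channel order
    assert(black <= cyan && black >= cyan); // neither one is less than the other
    assert(black != cyan);                  // yet not equal member-wise
}
```

The two colors fall into the same equivalence class of the ordering while still being distinguishable by ==.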
Jul 27 2014
parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 27 July 2014 at 21:07:08 UTC, Fool wrote:
 I think that some confusion is due to == is regarded as meaning 
 equality while it only means equivalence.
But I thought it was confirmed in this thread by the language designers that "==" is equality in D, thus "===" in other languages are not needed?
Jul 27 2014
parent "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 21:16:52 UTC, Ola Fosheim Grøstad 
wrote:
 But I thought it was confirmed in this thread by the language 
 designers that "==" is equality in D, thus "===" in other 
 languages are not needed?
For me two instances of a class located at different memory locations can never be equal since they can be distinguished. They can only be equivalent with respect to some relation. That said, I am not a native speaker, and probably wrong.
Jul 27 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Sun, Jul 27, 2014 at 07:04:08PM +0000, Fool via Digitalmars-d wrote:
 On Sunday, 27 July 2014 at 00:43:40 UTC, H. S. Teoh via Digitalmars-d wrote:
https://github.com/D-Programming-Language/dlang.org/pull/620
Thank you for this. There is still a problem, I think. Defining opEquals only makes sense if a user wants to replace equality by some equivalence relation (different from equality).
Not necessarily. The user type may be implemented in a way where member-wise binary comparison is not the correct implementation of equality. For example, it could be a tree structure implemented by integer indices into a backing array of nodes. There is no way the compiler could know, in general, how to correctly compare two instances of such a structure, since the bit-level representation of two equal objects may be completely different, yet they represent equivalent trees. You're still implementing equality, but it's equality that's not the same as binary equality.
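
As a rough illustration of equality that is not member-wise, here is a sketch using a simpler stand-in than the index-based tree: a hypothetical `IntSet` that stores its elements in arbitrary order. The type and its layout are assumptions made for the example only.

```d
import std.algorithm : sort;

// Hypothetical type: the same set may be stored in any element order.
struct IntSet
{
    int[] items;

    // Two sets are equal when they hold the same elements,
    // regardless of how the backing array happens to be laid out.
    bool opEquals(const IntSet rhs) const
    {
        auto a = items.dup;
        auto b = rhs.items.dup;
        sort(a);
        sort(b);
        return a == b;
    }
}

void main()
{
    auto x = IntSet([3, 1, 2]);
    auto y = IntSet([1, 2, 3]);

    assert(x.items != y.items); // the representations differ
    assert(x == y);             // but the values are equal
}
```

A type like this would normally also define toHash to match; that is left out here to keep the sketch short.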
 The user is not forced to define opEquals such that it models an
 equivalence relation but it hardly makes sense.
Well, in theory, if you *really* wanted to mess with the language (or would-be readers of your code), you *could* implement an arbitrary binary boolean predicate in opEquals, like defining a.opEquals(b) to be ackermannFunction(a-17,b) == sqrt(19*zetaFunction(b/PI)). The compiler won't stop you -- and, due to computability theory, it can't, in general, decide if your opEquals satisfies the axioms of an equivalence relation -- but it begs the question, why? If you set out to shoot yourself in the foot, it shouldn't be surprising if you blast your toes off. :-P
 Similarly, the user is free to define opCmp without restriction. In
 practice, however, it does not seem to make any sense if <= does not
 even model a preorder (reflexive and transitive) or one of >, <=, <
 does not match.
The problem with imposing these kinds of restrictions, is that they are generally not enforceable (at least, not without significantly crippling legitimate use cases). At some point, we have to stop babysitting the programmer and trust that he's competent enough to not try to subvert the language to make it do stuff it wasn't intended to do. As somebody once said: Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn We're not talking about Unix here, but the same principle applies.
 It turns out that intuition of many people around here is not at
 random. Let A be a set and <= a preorder on A. For all a, b in A
 define ~ such that a ~ b = (a <= b or b <= a). Then ~ is an
 equivalence relation. (Let me know if you need a proof.)
 
 Clearly, it is possible to define different equivalence relations on a
 set.  The same is true for orderings.
 
 Now opEquals and opCmp are used to define a default equivalence
 relation and ordering on a type, respectively.
 
 Please excuse my lack of creativity: in presence of opCmp I cannot see
 a single sensible use case for defining a.opEquals(b) different from
 a.opCmp(b) == 0.
Floating-point numbers? ;-) One-point compactification of the reals? There are legitimate use cases for this, though admittedly, it's not very common. That's why originally I proposed that opEquals would *default* to opCmp()==0, but the user could override that if need be.
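
To make the floating-point case concrete, here is a small sketch with a hypothetical `Wrapped` struct around a double. It assumes an int-returning opCmp, which has no way to say "unordered", so NaN ends up with opCmp() == 0 against everything while opEquals still reports inequality.

```d
// Hypothetical wrapper, for illustration only.
struct Wrapped
{
    double v;

    // An int result can only say less, equal, or greater.
    int opCmp(const Wrapped rhs) const
    {
        return v < rhs.v ? -1 : (v > rhs.v ? 1 : 0);
    }

    // Equality follows the built-in floating-point rules.
    bool opEquals(const Wrapped rhs) const
    {
        return v == rhs.v;
    }
}

void main()
{
    auto nan = Wrapped(double.nan);
    auto one = Wrapped(1.0);

    assert(nan.opCmp(one) == 0);          // "not less, not greater"
    assert(!(nan < one) && !(nan > one)); // consistent with the built-in type
    assert(nan != one);                   // but certainly not equal
    assert(nan != nan);                   // NaN is not even equal to itself
    assert(nan <= one);                   // unlike built-in doubles, <= comes out true
}
```

Defaulting opEquals to opCmp() == 0 would make `nan == one` true for this type, which is why an override has to remain possible.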
 Those examples mentioned before are skewed.
 
 If a candidate for opCmp does not match the default equivalence
 relation == (defined implicitly or explicitly specified using
 opEquals) it should not be defined at all.
Well, certainly, they have to be consistent, otherwise you will get strange results from your operators. :) But "consistent" may not necessarily be opEquals = (opCmp()==0). T -- Life would be easier if I had the source code. -- YHL
Jul 27 2014
next sibling parent reply "Fool" <fool dlang.org> writes:
On Monday, 28 July 2014 at 00:23:36 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Sun, Jul 27, 2014 at 07:04:08PM +0000, Fool via
 Defining opEquals only makes sense if a user wants to replace 
 equality
 by some equivalence relation (different from equality).
Not necessarily. The user type may be implemented in a way where member-wise binary comparison is not the correct implementation of equality. For example, it could be a tree structure implemented by integer indices into a backing array of nodes. There is no way the compiler could know, in general, how to correctly compare two instances of such a structure, since the bit-level representation of two equal objects may be completely different, yet they represent equivalent trees. You're still implementing equality, but it's equality that's not the same as binary equality.
I think we agree except for a subtle difference in defining equality and equivalence. In my personal language there is a single equality but there are many equivalences.
 The problem with imposing these kinds of restrictions, is that 
 they are
 generally not enforceable (at least, not without significantly 
 crippling
 legitimate use cases). At some point, we have to stop 
 babysitting the
 programmer and trust that he's competent enough to not try to 
 subvert
 the language to make it do stuff it wasn't intended to do. As 
 somebody
 once said:

 	Unix was not designed to stop people from doing stupid things,
 	because that would also stop them from doing clever things.
 	-- Doug Gwyn

 We're not talking about Unix here, but the same principle 
 applies.
I agree.
 Please excuse my lack of creativity: in presence of opCmp I 
 cannot see
 a single sensible use case for defining a.opEquals(b) 
 different from
 a.opCmp(b) == 0.
Floating-point numbers? ;-)
Thank you for pushing me there! It's true. So D has to separate opEquals and opCmp since otherwise a user could not define floating-point 'equality' and 'comparison' himself in the same way as it is defined by the language. I'm convinced now. :-) Thanks!
Jul 27 2014
next sibling parent "Ola Fosheim Gr" <ola.fosheim.grostad+dlang gmail.com> writes:
On Monday, 28 July 2014 at 06:05:03 UTC, Fool wrote:
 So D has to separate opEquals and opCmp since otherwise a user 
 could not define floating-point 'equality' and 'comparison' 
 himself in the same way as it is defined by the language.

 I'm convinced now. :-)
But opCmp does not affect <> and !<>, which is the closest you get to equivalence? Then again NaN is really bottom, not a proper value, but an exceptional state...
Jul 27 2014
prev sibling parent "Don" <x nospam.com> writes:
On Monday, 28 July 2014 at 06:05:03 UTC, Fool wrote:
 On Monday, 28 July 2014 at 00:23:36 UTC, H. S. Teoh via 
 Digitalmars-d wrote:
 On Sun, Jul 27, 2014 at 07:04:08PM +0000, Fool via
 Defining opEquals only makes sense if a user wants to replace 
 equality
 by some equivalence relation (different from equality).
Not necessarily. The user type may be implemented in a way where member-wise binary comparison is not the correct implementation of equality. For example, it could be a tree structure implemented by integer indices into a backing array of nodes. There is no way the compiler could know, in general, how to correctly compare two instances of such a structure, since the bit-level representation of two equal objects may be completely different, yet they represent equivalent trees. You're still implementing equality, but it's equality that's not the same as binary equality.
I think we agree except for a subtle difference in defining equality and equivalence. In my personal language there is a single equality but there are many equivalences.
 The problem with imposing these kinds of restrictions, is that 
 they are
 generally not enforceable (at least, not without significantly 
 crippling
 legitimate use cases). At some point, we have to stop 
 babysitting the
 programmer and trust that he's competent enough to not try to 
 subvert
 the language to make it do stuff it wasn't intended to do. As 
 somebody
 once said:

 	Unix was not designed to stop people from doing stupid things,
 	because that would also stop them from doing clever things.
 	-- Doug Gwyn

 We're not talking about Unix here, but the same principle 
 applies.
I agree.
 Please excuse my lack of creativity: in presence of opCmp I 
 cannot see
 a single sensible use case for defining a.opEquals(b) 
 different from
 a.opCmp(b) == 0.
Floating-point numbers? ;-)
Thank you for pushing me there! It's true. So D has to separate opEquals and opCmp since otherwise a user could not define floating-point 'equality' and 'comparison' himself in the same way as it is defined by the language. I'm convinced now. :-) Thanks!
Be careful, though. The argument that opCmp() and opEquals() are orthogonal is not correct. Although they are different concepts, they are closely related.

We must have:

    a == b implies a.opCmp(b) == 0

The converse does not apply, though. Otherwise you're abusing operator overloading, like when you define + to mean "reformat hard disk" or something.

Suppose we dealt correctly with floating point, including the <>= operators, etc. Then we'd require another overloaded operator:

    bool unordered(X other) // return true if !(this > other) && !(this < other)

The full situation is:

    a.opCmp(b) == 0 implies (a == b || a.unordered(b))

This applies to the RGB example, too.

If you define opCmp() for a type, then either:
(1) opEquals() is the same as opCmp() == 0, OR
(2) opEquals() is weird, and needs to be explicitly defined. What you're really doing is distinguishing the unordered case from the equal case.

IMHO, the ideal solution would be a really smart compiler that can detect violations of (1). At least, it would be fairly simple to add a runtime assert that this.opCmp(this) == 0 for all cases where opEquals is synthesised.
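
The required implication can be checked at run time with a small helper. This is only a sketch: `consistent`, `S`, and `Broken` are hypothetical names, and nothing like this is claimed to exist in the compiler or in Phobos.

```d
// Hypothetical helper: true unless a == b holds while a.opCmp(b) != 0.
// The converse (opCmp == 0 implies ==) is deliberately not required.
bool consistent(T)(T a, T b)
{
    return !(a == b) || a.opCmp(b) == 0;
}

// Orders by x only; == is the default member-wise comparison.
struct S
{
    int x, y;
    int opCmp(const S rhs) const
    {
        return x < rhs.x ? -1 : (x > rhs.x ? 1 : 0);
    }
}

// Deliberately broken: claims every value is greater than every other.
struct Broken
{
    int x;
    int opCmp(const Broken rhs) const { return 1; }
}

void main()
{
    assert(consistent(S(1, 2), S(1, 2)));        // equal, and opCmp agrees
    assert(consistent(S(1, 2), S(1, 3)));        // not equal, so nothing to check
    assert(!consistent(Broken(1), Broken(1)));   // equal, but opCmp returns 1
    assert(Broken(1).opCmp(Broken(1)) != 0);     // the reflexivity check would fire
}
```

The type from the start of the thread passes this check, since its opCmp() == 0 without == is exactly the permitted direction.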
Jul 28 2014
prev sibling parent "Ola Fosheim Grøstad" writes:
On Monday, 28 July 2014 at 00:23:36 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 On Sun, Jul 27, 2014 at 07:04:08PM +0000, Fool via 
 Digitalmars-d wrote:
 Similarly, the user is free to define opCmp without 
 restriction. In
 practice, however, it does not seem to make any sense if <= 
 does not
 even model a preorder (reflexive and transitive) or one of >, 
 <=, <
 does not match.
The problem with imposing these kinds of restrictions, is that they are generally not enforceable (at least, not without significantly crippling legitimate use cases). At some point, we have to stop babysitting the programmer and trust that he's competent enough to not try to subvert the language to make it do stuff it wasn't intended to do.
That's missing the point completely. If the compiler cannot obtain meta information about the properties of relations then you cannot introduce high level optimization and generic programming becomes crippled and a second rate citizen.
Jul 27 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/26/2014 9:46 AM, H. S. Teoh via Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 11:27:57PM -0700, Walter Bright via Digitalmars-d
wrote:
 I welcome a PR from you on this!
https://github.com/D-Programming-Language/dlang.org/pull/620
Thanks!
Jul 27 2014
prev sibling parent reply "Fool" <fool dlang.org> writes:
I agree that the documentation needs improvement.

It needs to be defined what kind of relation opCmp is meant to 
model.

If its concept is a partial order, opEquals cannot be inferred.

If its concept is a strict weak ordering [1, 2], which is 
required for sorting, opEquals can be inferred.

[1] https://www.sgi.com/tech/stl/StrictWeakOrdering.html
[2] 
https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings
Jul 26 2014
parent reply "Ola Fosheim Grøstad" writes:
On Saturday, 26 July 2014 at 10:22:54 UTC, Fool wrote:
 I agree that the documentation needs improvement.

 It needs to be defined what kind of relation opCmp is meant to 
 model.

 If its concept is a partial order, opEquals cannot be inferred.

 If its concept is a strict weak ordering [1, 2], which is 
 required for sorting, opEquals can be inferred.
I don't think so.

NaN < x is false
NaN > x is false

If you try to derive equality from that you would get:

NaN == x is true

For sorting you are obviously better off defining NaN in a manner that is consistent with the other values of the type, e.g. preceding all other float values.
Jul 26 2014
parent reply "Fool" <fool dlang.org> writes:
On Saturday, 26 July 2014 at 13:25:06 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 26 July 2014 at 10:22:54 UTC, Fool wrote:
 It needs to be defined what kind of relation opCmp is meant to 
 model.

 If its concept is a partial order, opEquals cannot be 
 inferred.

 If its concept is a strict weak ordering [1, 2], which is 
 required for sorting, opEquals can be inferred.
I don't think so.

NaN < x is false
NaN > x is false
...which means that < as it is usually defined on floating point numbers does not define a strict weak ordering. This suggests that opCmp should not be required to model a (strict) weak ordering. On the other hand this means that sorting is not defined by opCmp. The last fact is documented at http://dlang.org/phobos/std_algorithm.html#sort
 if you try to derive equality from that you would get:

 NaN == x is true
This is not a contradiction to what I wrote.
 For sorting you are obviously better off defining NaN as in a 
 manner that is consistent with the other values of the type. 
 E.g. preceding all other float values.
...which means that you are extending the partial order to a particular total order.
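
One possible extension of this kind, sketched with a hypothetical predicate `nanFirstLess` that puts every NaN before all other values and is then passed to std.algorithm's `sort`:

```d
import std.algorithm : sort;
import std.math : isNaN;

// Hypothetical predicate: a strict weak ordering on doubles in which
// all NaNs are mutually equivalent and precede every other value.
bool nanFirstLess(double a, double b)
{
    if (a.isNaN) return !b.isNaN; // NaN < non-NaN, but not NaN < NaN
    if (b.isNaN) return false;    // nothing else precedes NaN
    return a < b;                 // ordinary comparison otherwise
}

void main()
{
    double[] xs = [3.0, double.nan, 1.0, 2.0];
    sort!nanFirstLess(xs);

    assert(xs[0].isNaN);
    assert(xs[1 .. $] == [1.0, 2.0, 3.0]);
}
```

Placing NaN last instead would work equally well; the point is only that the predicate, unlike the built-in <, makes incomparability transitive.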
Jul 26 2014
next sibling parent reply "Ola Fosheim Grøstad" writes:
In order to get an overview for myself I've summarized the 
various properties of different types of orders as they are 
described in Wikipedia (right or wrong). I post it here in case 
others have an interest in it (or corrections/extensions to it):

* Preorder:

a ≤ a (reflexivity)

if a ≤ b and b ≤ c then a ≤ c (transitivity)


* Non-strict Partial Order:

a ≤ a (reflexivity);

if a ≤ b and b ≤ c then a ≤ c (transitivity).

if a ≤ b and b ≤ a then a = b (antisymmetry);


* Strict Partial Order:

not a < a (irreflexivity),

if a < b and b < c then a < c (transitivity), and

if a < b then not b < a (asymmetry; implied by irreflexivity and 
transitivity).


* Total Order:

a ≤ b or b ≤ a (totality).

If a ≤ b and b ≤ c then a ≤ c (transitivity);

If a ≤ b and b ≤ a then a = b (antisymmetry);


* Pseudo-Order:

not (a < b and b < a) (antisymmetry)

if a < b then a < c or c < b (co-transitivity/comparison)

if not (a < b or b < a) then a = b (equality)


a#b ===  a < b or b < a (apartness/negation of equality)
http://en.wikipedia.org/wiki/Apartness_relation


* Total Preorder:

x ≲ b or b ≲ a (totality).

if a ≲ b and b ≲ c then a ≲ c (transitivity).

x ≲ b (reflexivity; implied by transitivity and totality)


* Strict Weak Order: (complement of a total preorder)

not a < a (irreflexivity).

if a < b and b < c then a < c (transitivity).

if a < b then not b < a (asymmetry; implied by irreflexivity and 
transitivity).

if a is incomparable with b, and b is incomparable with c, then a 
is incomparable with c (transitivity of incomparability).
Jul 27 2014
parent "Ola Fosheim Grøstad" writes:
On Sunday, 27 July 2014 at 13:06:23 UTC, Ola Fosheim Grøstad 
wrote:
 * Total Preorder:

 x ≲ b or b ≲ a (totality).

 if a ≲ b and b ≲ c then a ≲ c (transitivity).

 x ≲ b (reflexivity; implied by transitivity and totality)
Correction: * Total Preorder: a ≲ b or b ≲ a (totality). if a ≲ b and b ≲ c then a ≲ c (transitivity). a ≲ a (reflexivity; implied by transitivity and totality)
Jul 27 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Saturday, 26 July 2014 at 16:43:06 UTC, Fool wrote:
 NaN < x is false
 NaN > x is false
...which means that < as it is usually defined on floating point numbers does not define a strict weak ordering.
Are you sure? Properties of a Strict Weak Ordering:
 if you try to derive equality from that you would get:

 NaN == x is true
This is not a contradiction to what I wrote.
Maybe you are right, but it does not match up to what I arrived at based on the properties for Strict Weak Ordering listed in Wikipedia.
Jul 27 2014
parent reply "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 16:39:01 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 26 July 2014 at 16:43:06 UTC, Fool wrote:
 NaN < x is false
 NaN > x is false
...which means that < as it is usually defined on floating point numbers does not define a strict weak ordering.
Are you sure?
One can define a strict weak ordering using different (but equivalent) sets of axioms.

We have

NOT (0.0 < NaN) AND NOT (NaN < 0.0) [0.0 and NaN are incomparable]

AND

NOT (NaN < 1.0) AND NOT (1.0 < NaN) [NaN and 1.0 are incomparable]

However, it does NOT hold that

NOT (0.0 < 1.0) AND NOT (1.0 < 0.0) [0.0 and 1.0 are incomparable]

Thus we do not have transitivity of incomparability: "For all x, y, and z, if x is incomparable with y, and y is incomparable with z, then x is incomparable with z." [1]

[1] https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings
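
The counterexample can be checked directly on built-in doubles. A minimal sketch, where `incomparable` is a hypothetical helper written for this example:

```d
// Hypothetical helper: neither value is less than the other.
bool incomparable(double a, double b)
{
    return !(a < b) && !(b < a);
}

void main()
{
    immutable nan = double.nan;

    assert(incomparable(0.0, nan));   // 0.0 and NaN are incomparable
    assert(incomparable(nan, 1.0));   // NaN and 1.0 are incomparable
    assert(!incomparable(0.0, 1.0));  // but 0.0 < 1.0, so transitivity fails
}
```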
Jul 27 2014
parent reply "Ola Fosheim Gr" <ola.fosheim.grostad+dlang gmail.com> writes:
On Sunday, 27 July 2014 at 18:00:29 UTC, Fool wrote:
 Thus we do not have transitivity of incomparability:
You are right, I forgot to test the case where only c is NaN. Well, that makes floats suck even more! :-)
Jul 27 2014
parent "Fool" <fool dlang.org> writes:
On Sunday, 27 July 2014 at 18:22:50 UTC, Ola Fosheim Gr wrote:
 On Sunday, 27 July 2014 at 18:00:29 UTC, Fool wrote:
 Thus we do not have transitivity of incomparability:
You are right, I forgot to test the case where only c is NaN. Well, that makes floats suck even more! :-)
In fact, IEEE 754 was designed by brilliant people: 'Non-Extended encodings are all "Lexicographically Ordered", which means that if two floating-point numbers in the same format are ordered (say x < y), then they are ordered the same way when their bits are reinterpreted as Sign-Magnitude integers. Consequently, processors need no floating-point hardware to search, sort and window floating-point arrays quickly. (However, some processors reverse byte-order!)' [1] Is a function for comparing floating-point numbers this way available in Phobos? [1] W. Kahan: "Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic", 1997. URL http://www.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
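
A rough sketch of the bit-reinterpretation idea for doubles, assuming the usual 64-bit IEEE 754 layout. `orderKey` is a hypothetical helper; it maps the sign-magnitude bit pattern to an ordinary two's-complement integer so that plain integer comparison gives the same order. Later Phobos releases appear to offer a total-order comparison as `std.math.cmp`, but this sketch does not rely on it.

```d
// Hypothetical helper: an integer key whose order matches the order
// of the doubles it was derived from (for values that are ordered).
long orderKey(double x)
{
    long bits = *cast(long*) &x; // reinterpret the 64 bits as an integer

    // Negative doubles have the sign bit set, so bits < 0. Flipping the
    // 63 magnitude bits makes a larger magnitude sort lower.
    return bits < 0 ? (bits ^ long.max) : bits;
}

void main()
{
    assert(orderKey(-2.0) < orderKey(-1.0));
    assert(orderKey(-1.0) < orderKey(-0.0));
    assert(orderKey(-0.0) < orderKey(0.0)); // the two zeros get distinct keys
    assert(orderKey(0.0) < orderKey(1.0));
    assert(orderKey(1.0) < orderKey(double.infinity));
}
```

Note that this key separates -0.0 from +0.0, so it defines an order that is finer than the built-in ==.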
Jul 27 2014
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Sat, 26 Jul 2014 05:22:26 +0100, Walter Bright  
<newshound2 digitalmars.com> wrote:
 If you don't want to accept that equality and comparison are  
 fundamentally different operations, I can only repeat saying the same  
 things.
For the majority of use cases they are *not* in fact fundamentally different. You're correct, they are *actually* fundamentally different at a conceptual/theoretical level, but this difference is irrelevant in the majority of cases. It is true that we need to be able to define/model this difference (which is why we have both opCmp and opEquals) but it is *not* true that every user, for every object, needs to be aware of and cope with this difference. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 28 2014
prev sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 25 Jul 2014 21:38:33 +0100, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 7/25/2014 4:10 AM, Regan Heath wrote:
 Sure, Andrei makes a valid point .. for a minority of cases.  The  
 majority case
 will be that opEquals and opCmp==0 will agree.  In those minority cases  
 where
 they are intended to disagree the user will have intentionally defined  
 both, to
 be different.  I cannot think of any case where a user will intend for  
 these to
 be different, then not define both to ensure it.
You've agreed with my point, then, that autogenerating opEquals as memberwise equality (not opCmp==0) if one is not supplied will be correct unless the user code is already broken.
No, you've misunderstood my point.

My point was that for the vast majority of coders, in the vast majority of cases, opCmp()==0 will agree with opEquals(). It is only in very niche cases, i.e. where partial ordering is actually present and important, that this assumption should be broken.

Yet, by default, if a user defines opCmp() the compiler-generated opEquals may well violate that assumption. This is surprising and will lead to subtle bugs.

If someone is intentionally defining an object for partial ordering they will expect to have to define both opCmp and opEquals, and not only that, if they somehow neglect to do so their first test of partial ordering will show they have a bug and they will soon realise their mistake.

The same cannot be said for someone who wants total ordering (the majority of users in the majority of cases). In this case they are unlikely to specifically test for ordering bugs, and this mistake will creep in and cause trouble down the line.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Jul 28 2014
parent "Ola Fosheim Grøstad" writes:
On Monday, 28 July 2014 at 09:37:27 UTC, Regan Heath wrote:
 My point was that for the vast majority of coders, in the vast 
 majority of cases opCmp()==0 will agree with opEquals().  It is 
 only in very niche cases i.e. where partial ordering is 
 actually present and important, that this assumption should be 
 broken.

 Yet, by default, if a user defines opCmp() the compiler 
 generated opEquals may well violate that assumption.  This is 
 surprising and will lead to subtle bugs.
The cheap non-breaking solution is to just add opCmpTotal() and map opCmp() to that. If opCmpTotal is defined then you cannot define opCmp() and opCmp(a,b)==0 should match a==b whether redefined or not.
Jul 28 2014
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 08:02:18 UTC, Jacob Carlborg wrote:
 1. If neither opCmp or opEquals are defined, the compiler will 
 automatically generate these and will be used for comparison 
 and equivalent

 2. If opEquals is defined, lhs == rhs will be lowered to 
 lhs.opEquals(rhs)

 3. If opCmp is defined but no opEquals, lhs == rhs will be 
 lowered to lhs.opCmp(rhs) == 0

 4. If opCmp and opEquals is defined, lhs == rhs will be lowered 
 to lhs.opEquals(rhs)
The compiler _never_ defines opCmp for you. You have to do that yourself. So, what you're suggesting would force people to define opEquals just because they defined opCmp unless they wanted to take a performance hit. And once you define opEquals, you have to define toHash. So, what you're suggesting would force a lot more code to define toHash, which will likely cause far more bugs than simply requiring that the programmer define opEquals if that's required in order to make it consistent with opCmp. - Jonathan M Davis
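
For reference, a sketch of what keeping the three functions consistent by hand might look like. `Key` is a hypothetical type that orders, compares, and hashes on `x` alone; the trivial hash is an assumption chosen only to keep the example short.

```d
// Hypothetical type: y is deliberately ignored everywhere.
struct Key
{
    int x;
    int y;

    int opCmp(const Key rhs) const
    {
        return x < rhs.x ? -1 : (x > rhs.x ? 1 : 0);
    }

    // Equality defined in terms of the ordering.
    bool opEquals(const Key rhs) const
    {
        return opCmp(rhs) == 0;
    }

    // Equal values must hash alike, so only x may contribute.
    size_t toHash() const @safe pure nothrow
    {
        return cast(size_t) x;
    }
}

void main()
{
    auto a = Key(1, 2);
    auto b = Key(1, 3);

    assert(a == b);                   // agrees with opCmp() == 0
    assert(!(a < b) && !(a > b));
    assert(a.toHash() == b.toHash()); // consistent with ==
}
```

Forgetting to exclude `y` from toHash is exactly the kind of mistake the extra boilerplate invites.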
Jul 25 2014
next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Jonathan M Davis"  wrote in message 
news:lzigfacgrlssjuemoqyg forum.dlang.org...

 The compiler _never_ defines opCmp for you. You have to do that yourself. 
 So, what you're suggesting would force people to define opEquals just 
 because they defined opCmp unless they wanted to take a performance hit 
 <<<<<<<< in the rare case that it actually matters >>>>>>>>>>.
 And once you define opEquals, you have to define toHash. So, what you're 
 suggesting would force a lot more code to define toHash, which will likely 
 cause far more bugs than simply requiring that the programmer define 
 opEquals if that's required in order to make it consistent with opCmp. 
Jul 25 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 10:10:50 UTC, Daniel Murphy wrote:
 "Jonathan M Davis"  wrote in message 
 news:lzigfacgrlssjuemoqyg forum.dlang.org...

 The compiler _never_ defines opCmp for you. You have to do 
 that yourself. So, what you're suggesting would force people 
 to define opEquals just because they defined opCmp unless they 
 wanted to take a performance hit <<<<<<<< in the rare case 
 that it actually matters >>>>>>>>>>.
Equality checks are a common operation, so it will affect a fair bit of code. Granted, how much it will really matter is an open question, but there will be a small reduction in speed to quite a bit of code out there. But regardless of whether the efficiency cost is large, you're talking about incurring it just to fix the code of folks who couldn't be bothered to make sure that opEquals and lhs.opCmp(rhs) == 0 were equivalent. You'd be punishing correct code (however slight that punishment may be) in order to fix the code of folks who didn't even properly test basic functionality. I see no reason to care about trying to help out folks who can't even be bothered to test opEquals and opCmp, especially when that help isn't free. - Jonathan M Davis
Jul 25 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 12:39, Jonathan M Davis wrote:

 But regardless of whether the efficiency cost is large, you're talking
 about incurring it just to fix the code of folks who couldn't be
 bothered to make sure that opEquals and lhs.opCmp(rhs) == 0 were
 equivalent. You'd be punishing correct code (however slight that
 punishment may be) in order to fix the code of folks who didn't even
 properly test basic functionality. I see no reason to care about trying
 to help out folks who can't even be bothered to test opEquals and opCmp,
 especially when that help isn't free.
By Walter and Andrei's definition, opCmp is not to be used for equivalence; therefore opCmp never needs to return 0.

-- 
/Jacob Carlborg
Jul 25 2014
parent reply Manu via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 25 July 2014 22:06, Jacob Carlborg via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On 25/07/14 12:39, Jonathan M Davis wrote:

  But regardless of whether the efficiency cost is large, you're talking
 about incurring it just to fix the code of folks who couldn't be
 bothered to make sure that opEquals and lhs.opCmp(rhs) == 0 were
 equivalent. You'd be punishing correct code (however slight that
 punishment may be) in order to fix the code of folks who didn't even
 properly test basic functionality. I see no reason to care about trying
 to help out folks who can't even be bothered to test opEquals and opCmp,
 especially when that help isn't free.
By Walter and Andrei's definition, opCmp is not to be used for equivalence; therefore opCmp never needs to return 0.
Yes it does, <= and >= are both things that you can type.
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 5:15 AM, Manu via Digitalmars-d wrote:
 On 25 July 2014 22:06, Jacob Carlborg via Digitalmars-d
     By Walter and Andrei's definition, opCmp is not to be used for equivalence;
     therefore opCmp never needs to return 0.
 Yes it does, <= and >= are both things that you can type.
Incorrect, as an object may not even have a notion of equality. Nothing requires opCmp to ever return 0. Of course, such an opCmp would never have worked with AAs anyway.
Jul 25 2014
prev sibling next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
On 25.07.2014 12:07, Jonathan M Davis wrote:
 And once you define opEquals, you have to define
 toHash. So, what you're suggesting would force a lot more code to define
 toHash, which will likely cause far more bugs than simply requiring that
Is it actually hard to define toHash, or should it be? What is done by default? I guess some magic hash is built over all members of a type (like all members are compared in opEquals).

So couldn't there be some templated function that creates the hash for you in the same way as it's done now, but only for the values you want to hash? e.g.

    hash_t createHash(T...)(T args) {
        return (do magic with args);
    }

    struct Foo {
        int x;
        int y;
        string str;
        int dontCare;

        bool opEquals()(auto ref const Foo o) const {
            return x == o.x && y == o.y && str == o.str;
        }

        hash_t toHash() {
            return createHash(x, y, str);
        }
    }

Cheers,
Daniel
Jul 25 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 10:27:27 UTC, Daniel Gibson wrote:
 On 25.07.2014 12:07, Jonathan M Davis wrote:
 And once you define opEquals, you have to define
 toHash. So, what you're suggesting would force a lot more code 
 to define
 toHash, which will likely cause far more bugs than simply 
 requiring that
Is it actually hard to define toHash, or should it be? What is done by default? I guess some magic hash is built over all members of a type (like all members are compared in opEquals). So couldn't there be some templated function that creates the hash for you in the same way as it's done now, but only for the values you want to hash?
Sure. We could create something like that, and we probably should. It would help out in cases where the default wasn't appropriate (e.g. only some of the member variables were part of opEquals).

But why force folks to define opEquals and toHash when the defaults would have worked fine for them just to fix the code of folks who didn't make the effort to test that opEquals and lhs.opCmp(rhs) == 0 were equivalent? That seems to me like we're punishing the folks who actually write good code and test it in order to help those who don't even test the basic functionality of their types.

- Jonathan M Davis
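For illustration, here is a minimal sketch of what such a helper might look like. It assumes a reasonably recent druntime, where `hashOf(value, seed)` is available without an import; `createHash` itself is a hypothetical name taken from the post above, not an existing library function, and `Foo` is the struct from that post with `toHash` given the `const nothrow @safe` signature that AAs expect.

```d
// Hypothetical helper (not part of Phobos or druntime): fold the hashes of
// the given values into one, using druntime's built-in hashOf.
size_t createHash(T...)(const T args)
{
    size_t h = 0;
    foreach (arg; args)
        h = hashOf(arg, h); // seed each step with the running hash
    return h;
}

struct Foo
{
    int x;
    int y;
    string str;
    int dontCare; // intentionally ignored by opEquals and toHash

    bool opEquals()(auto ref const Foo o) const
    {
        return x == o.x && y == o.y && str == o.str;
    }

    size_t toHash() const nothrow @safe
    {
        // Hash exactly the fields that opEquals compares.
        return createHash(x, y, str);
    }
}

unittest
{
    auto a = Foo(1, 2, "key", 0);
    auto b = Foo(1, 2, "key", 99); // differs only in dontCare

    assert(a == b);
    assert(a.toHash() == b.toHash());

    int[Foo] aa;
    aa[a] = 42;
    assert(b in aa); // b finds the entry stored under a
}
```

The unittest block can be run with `dmd -unittest -main`. The point of the design is that the list of fields appears in one place per function, so keeping toHash in step with opEquals is a matter of passing the same fields to both.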
Jul 25 2014
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 25/07/14 12:07, Jonathan M Davis wrote:

 The compiler _never_ defines opCmp for you. You have to do that
 yourself. So, what you're suggesting would force people to define
 opEquals just because they defined opCmp unless they wanted to take a
 performance hit. And once you define opEquals, you have to define
 toHash. So, what you're suggesting would force a lot more code to define
 toHash, which will likely cause far more bugs than simply requiring that
 the programmer define opEquals if that's required in order to make it
 consistent with opCmp.
Again, you're assuming there will be a performance hit. -- /Jacob Carlborg
Jul 25 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Putting it simply,

1. == uses opEquals. If you don't supply opEquals, the compiler will make one 
for you.

2. AAs use ==. See rule 1.


Easy to understand, easy to explain, easy to document.
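As a rough sketch of how these two rules play out, the struct below mirrors the one from the start of the thread: it defines only opCmp and lets the compiler generate opEquals. This assumes a compiler in which the rules above are in effect; the names `a`, `b` and `aa` are only for illustration.

```d
struct S
{
    int x;
    int y;

    // Orders by x only; y is ignored.
    int opCmp(const S rhs) const
    {
        return (x > rhs.x) - (x < rhs.x);
    }
}

unittest
{
    auto a = S(1, 2);
    auto b = S(1, 3);

    assert(a.opCmp(b) == 0); // the ordering treats them as equivalent
    assert(a != b);          // rule 1: == is the generated member-wise opEquals

    int[S] aa;               // rule 2: the AA compares keys with ==
    aa[a] = 10;
    aa[b] = 20;
    assert(aa.length == 2);  // so a and b are distinct keys
}
```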
Jul 25 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 10:48, Walter Bright wrote:
 Putting it simply,

 1. == uses opEquals. If you don't supply opEquals, the compiler will
 make one for you.

 2. AAs use ==. See rule 1.


 Easy to understand, easy to explain, easy to document.
It's very hard to use D when it constantly changes and breaks code. It's especially annoying reading your comments on reddit that we must stop breaking code. Then a few days later you go and break code. I really hope no one gets false hopes from those comments.

-- 
/Jacob Carlborg
Jul 25 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 2:12 AM, Jacob Carlborg wrote:
 On 25/07/14 10:48, Walter Bright wrote:
 Putting it simply,

 1. == uses opEquals. If you don't supply opEquals, the compiler will
 make one for you.

 2. AAs use ==. See rule 1.


 Easy to understand, easy to explain, easy to document.
It's very hard to use D when it constantly changes and breaks code. It's especially annoying reading your comments on reddit that we must stop breaking code. Then a few days later you go and break code. I really hope no one gets false hopes from those comments.
We went through the likely code breakage from this in this thread, and it's hard to see any non-broken code breaking. It will also fix regression https://issues.dlang.org/show_bug.cgi?id=13179 and stop that breakage.
Jul 25 2014
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 11:37, Walter Bright wrote:

 We went through the likely code breakage from this in this thread, and
 it's hard to see any non-broken code breaking. It will also fix
 regression https://issues.dlang.org/show_bug.cgi?id=13179 and stop that
 breakage.
So opEquals will not have to be defined when a type that defines opCmp is used as an AA key?

-- 
/Jacob Carlborg
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 5:10 AM, Jacob Carlborg wrote:
 On 25/07/14 11:37, Walter Bright wrote:

 We went through the likely code breakage from this in this thread, and
 it's hard to see any non-broken code breaking. It will also fix
 regression https://issues.dlang.org/show_bug.cgi?id=13179 and stop that
 breakage.
So opEquals will not have to be defined when a type that defines opCmp is used as an AA key?
Right.
Jul 25 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 02:37:46AM -0700, Walter Bright via Digitalmars-d wrote:
 On 7/25/2014 2:12 AM, Jacob Carlborg wrote:
On 25/07/14 10:48, Walter Bright wrote:
Putting it simply,

1. == uses opEquals. If you don't supply opEquals, the compiler will
make one for you.

2. AAs use ==. See rule 1.


Easy to understand, easy to explain, easy to document.
It's very hard to use D when it constantly changes and breaks code. It's especially annoying reading your comments on reddit that we must stop breaking code. Then a few days later you go and break code. I really hope no one gets false hopes from those comments.
We went through the likely code breakage from this in this thread, and it's hard to see any non-broken code breaking. It will also fix regression https://issues.dlang.org/show_bug.cgi?id=13179 and stop that breakage.
You're missing the fact that:

1) Before we fixed the use of typeinfo.compare in aaA.d, users were *required* to define opCmp in order for their types to be usable as AA keys -- because that's what aaA.d used to compare keys. This applies even if the user type has no meaningful partial ordering -- you still had to define opCmp because otherwise it just plain won't work properly.

2) Because of (1), there's now code out there that defines opCmp for user types just so that they can be used as AA keys. But we're now breaking that by saying "nyah nyah we're now using opEquals for AA keys, so your opCmp don't work no more, sux to be you!".

From the perspective of the user, this can be extremely frustrating: prior to 2.066, they were told "you must define opCmp otherwise your AA keys won't work properly". So like obedient little lambs they went ahead and did that. And now in 2.066 we're saying "you must define opEquals otherwise your AA keys won't work properly -- and all your opCmp's are now useless 'cos that was wrong in the first place".

From the perspective of the user, this seems like unreasonable code breakage -- first they *already* expected in the past that their code should've worked with opEquals, but they were told to use opCmp because AA's were buggy. Yet now we're going back on our word and saying that opCmp was wrong -- the very workaround we recommended in the past -- and that now they need to define opEquals instead, which didn't work before. This gives users the perception that we're constantly going back on our word and breaking prior code and annulling previously recommended workarounds without any warning.

The whole reason the opCmp()==0 thing was brought up, was to eliminate this frustration -- give users a nice way to transition into the correct AA design of using opEquals for their keys, instead of just outright breaking past recommendations in their face with no warning.


T

-- 
Always remember that you are unique. Just like everybody else. -- despair.com
Jul 25 2014
next sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 25 July 2014 at 14:10:11 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 The whole reason the opCmp()==0 thing was brought up, was to 
 eliminate
 this frustration -- give users a nice way to transition into 
 the correct
 AA design of using opEquals for their keys, instead of just 
 outright
 breaking past recommendations in their face with no warning.
Not just that, it's also the right thing to do, independently from AAs. I'm astonished that it doesn't work like that already. When I first read the operator overloading docs, I really liked that in D I don't need to define all the individual comparison operators, but only opCmp. I totally expected this to include opEquals, too, which I thought was just an option that could be used to enhance performance, but which could be safely ignored otherwise. It's really bad if this isn't the case, because then there is nothing telling you that you need an opEquals, it will just silently compile and appear to be working.
Jul 25 2014
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
On 25.07.2014 18:11, "Marc Schütz" <schuetzm gmx.net> wrote:
 I'm astonished that it doesn't work like that already. When I first read
 the operator overloading docs, I really liked that in D I don't need to
 define all the individual comparison operators, but only opCmp. I
Well, to be fair, the documentation is pretty explicit about it: the headings are "Overloading == and !=" and "Overloading <, <=, >, and >=". The D1 documentation even had a rationale for why there are both opEquals and opCmp; no idea why that was dropped for D2.

However, I read about opCmp at some time and in the meantime forgot about the "not for ==" part - but this is probably a problem with my brain (or the long timespan) and not with the documentation.

Cheers,
Daniel
Jul 25 2014
parent reply "Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 25 July 2014 at 18:54:15 UTC, Daniel Gibson wrote:
 On 25.07.2014 18:11, "Marc Schütz" <schuetzm gmx.net> wrote:
 I'm astonished that it doesn't work like that already. When I 
 first read
 the operator overloading docs, I really liked that in D I 
 don't need to
 define all the individual comparison operators, but only 
 opCmp. I
Well, to be fair, the documentation is pretty explicit about it: the headings are "Overloading == and !=" and "Overloading <, <=, >, and >=".
Whatever the outcome of the discussion will be, it needs to be documented much better. The current documentation doesn't say anything about whether, and how, opEquals and opCmp relate. It doesn't even mention that they are supposed to be consistent.

I'm just afraid that it will not be noticed, because it will be "hidden" in the documentation. If the status quo is kept, you just won't know you've written wrong code, even though the compiler has all the means to tell you.
 The D1 documentation even had a rationale why there's both 
 opEquals and opCmp, no idea why that was dropped for D2.

 However, I read about opCmp at some time and in the meantime 
 forgot about the "not for ==" part - but this is probably a 
 problem with my brain (or the long timespan) and not with the 
 documentation.
Well, you're not the only one :-(
Jul 25 2014
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 07:10:50PM +0000, via Digitalmars-d wrote:
 On Friday, 25 July 2014 at 18:54:15 UTC, Daniel Gibson wrote:
[...]
Well, to be fair, the documentation is pretty explicit about it: the
headings are "Overloading == and !=" and "Overloading <, <=, >, and
>=".
Whatever the outcome of the discussion will be, it needs to be documented much better. The current documentation doesn't say anything about whether, and how, opEquals and opCmp relate. It doesn't even mention that they are supposed to be consistent. I'm just afraid that it will not be noticed, because it will be "hidden" in the documentation. If the status quo is kept, you just won't know you've written wrong code, even though the compiler has all the means to tell you.
The D1 documentation even had a rationale why there's both opEquals
and opCmp, no idea why that was dropped for D2.

However, I read about opCmp at some time and in the meantime forgot
about the "not for ==" part - but this is probably a problem with my
brain (or the long timespan) and not with the documentation.
Well, you're not the only one :-(
Yeah, we definitely need to improve the docs so that the distinction is clearer.


T

-- 
A tree is held up by its roots, and a person by their friends.
Jul 25 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 7:08 AM, H. S. Teoh via Digitalmars-d wrote:
 You're missing the fact that:

 1) Before we fixed the use of typeinfo.compare in aaA.d, users were
 *required* to define opCmp in order for their types to be usable as AA
 keys -- because that's what aaA.d used to compare keys. This applies
 even if the user type has no meaningful partial ordering -- you still
 had to define opCmp because otherwise it just plain won't work properly.
I'm not missing that at all. I wrote that requirement.
 2) Because of (1), there's now code out there that defines opCmp for
 user types just so that they can be used as AA keys. But we're now
 breaking that by saying "nyah nyah we're now using opEquals for AA keys,
 so your opCmp don't work no more, sux to be you!".
No, we're not breaking code, unless the user wrote an opCmp that doesn't produce the same result as ==. This kind of struct would not make sense to use in an AA, and that code is most likely very rare and quite broken.
 From the perspective of the user, this can be extremely frustrating:
 prior to 2.066, they were told "you must define opCmp otherwise your AA
 keys won't work properly". So like obedient little lambs they went ahead
 and did that. And now in 2.066 we're saying "you must define opEquals
 otherwise your AA keys won't work properly
Nope. If the user doesn't write opEquals, the compiler will generate one for him, using the same method as it did for ==.
 The whole reason the opCmp()==0 thing was brought up, was to eliminate
 this frustration -- give users a nice way to transition into the correct
 AA design of using opEquals for their keys, instead of just outright
 breaking past recommendations in their face with no warning.
As I explained to Jacob, such a workaround will introduce subtle problems that are going to be hard to live with.
Jul 25 2014
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 20:28:23 UTC, Walter Bright wrote:
 Nope. If the user doesn't write opEquals, the compiler will 
 generate one for him, using the same method as it did for ==.
Well, that was the case in 2.065, whereas 2.066 currently gives an error if you use a type as a key in an AA without defining opEquals for it. That change needs to be reverted. - Jonathan M Davis
Jul 25 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 2:57 PM, Jonathan M Davis wrote:
 Well, that was the case in 2.065, whereas 2.066 currently gives an error if you
 use a type as a key in an AA without defining opEquals for it. That change
 needs to be reverted.
Do you remember which PR it was?
Jul 25 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 22:34:41 UTC, Walter Bright wrote:
 On 7/25/2014 2:57 PM, Jonathan M Davis wrote:
 Well, that was the case in 2.065, whereas 2.066 currently 
 gives an error if you
 use a type as a key in an AA without defining opEquals for it. 
 That change needs
 to be reverted.
Do you remember which PR it was?
No idea, unfortunately. The only reason that I even know about it is this thread and https://issues.dlang.org/show_bug.cgi?id=13179 (which discusses basically the same thing as this thread). - Jonathan M Davis
Jul 25 2014
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 09:12:11 UTC, Jacob Carlborg wrote:
 On 25/07/14 10:48, Walter Bright wrote:
 Putting it simply,

 1. == uses opEquals. If you don't supply opEquals, the 
 compiler will
 make one for you.

 2. AAs use ==. See rule 1.


 Easy to understand, easy to explain, easy to document.
It's very hard to use D when it constantly changes and breaks code. It's especially annoying reading your comments on reddit that we must stop breaking code. Then a few days later you go and break code. I really hope no one gets false hopes from those comments.
The _only_ code that would break would be code that's _already_ broken - code that defines opCmp in a way that's inconsistent with the default opEquals and then doesn't define opEquals. I see no reason to worry about making sure that we don't break code that's already broken. - Jonathan M Davis
Jul 25 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 25/07/14 12:09, Jonathan M Davis wrote:

 The _only_ code that would break would be code that's _already_ broken -
 code that defines opCmp in a way that's inconsistent with the default
 opEquals and then doesn't define opEquals. I see no reason to worry
 about making sure that we don't break code that's already broken.
I see no reason why I should define opEquals when opCmp was used for AA keys. You keep ignoring that argument. -- /Jacob Carlborg
Jul 25 2014
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 02:08:55PM +0200, Jacob Carlborg via Digitalmars-d
wrote:
 On 25/07/14 12:09, Jonathan M Davis wrote:
 
The _only_ code that would break would be code that's _already_
broken - code that defines opCmp in a way that's inconsistent with
the default opEquals and then doesn't define opEquals. I see no
reason to worry about making sure that we don't break code that's
already broken.
I see no reason why I should define opEquals when opCmp was used for AA keys. You keep ignoring that argument.
[...] AA's don't care about keys being orderable, all they care about is that keys should have a hash value, and be comparable. It was a mistake to use opCmp for AA keys in the first place. We're now fixing this mistake. The issue at hand is really more of easing the transition from the old buggy design so that we don't break old code where they used to work correctly. T -- Error: Keyboard not attached. Press F1 to continue. -- Yoon Ha Lee, CONLANG
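To make the "hash plus equality, no ordering" point concrete, here is a small sketch with a hypothetical `Color` type that has no sensible ordering and therefore defines no opCmp. It relies only on the compiler-generated opEquals and the default hash, and assumes a compiler where AAs compare keys with opEquals as described above.

```d
struct Color
{
    ubyte r, g, b; // no opCmp: there is no natural ordering of colors
}

unittest
{
    string[Color] names;
    names[Color(255, 0, 0)] = "red";
    names[Color(0, 0, 255)] = "blue";

    // Lookup needs only hashing and equality.
    assert(names[Color(255, 0, 0)] == "red");
    assert(Color(0, 255, 0) !in names);

    // The type is a perfectly good AA key even though it cannot be ordered.
    static assert(!__traits(compiles, Color(1, 2, 3) < Color(3, 2, 1)));
}
```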
Jul 25 2014
parent Jacob Carlborg <doob me.com> writes:
On 2014-07-25 15:51, H. S. Teoh via Digitalmars-d wrote:

 AA's don't care about keys being orderable, all they care about is that
 keys should have a hash value, and be comparable. It was a mistake to
 use opCmp for AA keys in the first place. We're now fixing this mistake.
I'm responding to Jonathan's claims that types that defined opCmp but not opEquals are broken. -- /Jacob Carlborg
Jul 25 2014
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 12:08:55 UTC, Jacob Carlborg wrote:
 On 25/07/14 12:09, Jonathan M Davis wrote:

 The _only_ code that would break would be code that's 
 _already_ broken -
 code that defines opCmp in a way that's inconsistent with the 
 default
 opEquals and then doesn't define opEquals. I see no reason to 
 worry
 about making sure that we don't break code that's already 
 broken.
I see no reason why I should define opEquals when opCmp was used for AA keys. You keep ignoring that argument.
opEquals will now be used for AA keys, not opCmp. That's why git master generates errors when you have a struct which defines opCmp and not opEquals, and you try and use it as an AA key. It was done on the theory that your opEquals and opCmp might not match (which would be buggy code to begin with, so it would be forcing everyone to change their code just because someone might have gotten their opEquals and opCmp wrong).

If we keep the same behavior as 2.065 but still change the AAs to use opEquals, then there's no need to define opEquals unless the type was buggy and defined opCmp in a way that was inconsistent with the default opEquals and then didn't define one which was consistent. The code will continue to work.

H.S. Teoh wants to change the default-generated opEquals to be equivalent to lhs.opCmp(rhs) == 0 in the case where opCmp is defined in order to avoid further breaking the code of folks whose code is broken and didn't define opEquals when opCmp didn't match the default.

So, if we remove the new check for a user-defined opEquals when opCmp is defined, then you don't have to define opEquals. If we do what H.S. Teoh suggests, then you'll have to define it if you want to avoid the additional checks that opCmp would be doing that opEquals wouldn't do, but if you didn't care, then you wouldn't. If we leave it as it is in git master, then you'd always have to define it if you defined opCmp and wanted to use it as an AA key, and since opCmp was used for AA keys before, that means that _every_ type which didn't define opEquals but was used as an AA key will suddenly have to define opEquals and toHash and will thus now be broken. So, the current situation in git master is the worst all around.

- Jonathan M Davis
Jul 25 2014
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-07-25 20:56, Jonathan M Davis wrote:

 opEquals will now be used for AA keys, not opCmp.
Well, yes. But that was not the case when the code was written. In that case it was correct to define opCmp.
 That's why git master
 generates errors when you have a struct which defines opCmp and not
 opEquals, and you try and use it as an AA key. It was done on the theory
 that your opEquals and opCmp might not match (which would be buggy code
 to begin with, so it would be forcing everyone to change their code just
 because someone might have gotten their opEquals and opCmp wrong).

 If we keep the same behavior as 2.065 but still change the AAs to use
 opEquals, then there's no need to define opEquals unless the type was
 buggy and defined opCmp in a way that was inconsistent with the default
 opEquals and then didn't define one which was consistent. The code will
 continue to work.

 H.S. Teoh wants to change the default-generated opEquals to be
 equivalent to lhs.opCmp(rhs) == 0 in the case where opCmp is defined in
 order to avoid further breaking the code of folks whose code is broken
 and didn't define opEquals when opCmp didn't match the default.

 So, if we remove the new check for a user-defined opEquals when opCmp is
 defined, then you don't have to define opEquals. If we do what H.S. Teoh
 suggests, then you'll have to define it if you want to avoid the
 additional checks that opCmp would be doing that opEquals wouldn't do,
 but if you didn't care, then you wouldn't. If we leave it as it is in
 git master, then you'd always have to define it if you defined opCmp and
 wanted to use it as an AA key, and since opCmp was used for AA keys
 before, that means that _every_ type which didn't define opEquals but
 was used as an AA key will suddenly have to define opEquals and toHash
 and will thus now be broken. So, the current situation in git master is
 the worst all around.
That's what I'm saying. I don't understand what you're arguing for/against. -- /Jacob Carlborg
Jul 25 2014
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, 25 July 2014 at 20:07:08 UTC, Jacob Carlborg wrote:
 On 2014-07-25 20:56, Jonathan M Davis wrote:

 opEquals will now be used for AA keys, not opCmp.
Well, yes. But that was not the case when the code was written. In that case it was correct to define opCmp.
Yes, but opCmp always had to be consistent with opEquals, or the code was broken. If the code was "lucky," and the key type was only ever used with opCmp and never opEquals, then the bug wouldn't have manifested itself, but the odds of that are probably low, and it's a bug regardless.
 That's why git master
 generates errors when you have a struct which defines opCmp 
 and not
 opEquals, and you try and use it as an AA key. It was done on 
 the theory
 that your opEquals and opCmp might not match (which would be 
 buggy code
 to begin with, so it would be forcing everyone to change their 
 code just
 because someone might have gotten their opEquals and opCmp 
 wrong).

 If we keep the same behavior as 2.065 but still change the AAs 
 to use
 opEquals, then there's no need to define opEquals unless the 
 type was
 buggy and defined opCmp in a way that was inconsistent with 
 the default
 opEquals and then didn't define one which was consistent. The 
 code will
 continue to work.

 H.S. Teoh wants to change the default-generated opEquals to be
 equivalent to lhs.opCmp(rhs) == 0 in the case where opCmp is 
 defined in
 order to avoid further breaking the code of folks whose code 
 is broken
 and didn't define opEquals when opCmp didn't match the default.

 So, if we remove the new check for a user-defined opEquals 
 when opCmp is
 defined, then you don't have to define opEquals. If we do what 
 H.S. Teoh
 suggests, then you'll have to define it if you want to avoid 
 the
 additional checks that opCmp would be doing that opEquals 
 wouldn't do,
 but if you didn't care, then you wouldn't. If we leave it as 
 it is in
 git master, then you'd always have to define it if you defined 
 opCmp and
 wanted to use it as an AA key, and since opCmp was used for AA 
 keys
 before, that means that _every_ type which didn't define 
 opEquals but
 was used as an AA key will suddenly have to define opEquals 
 and toHash
 and will thus now be broken. So, the current situation in git 
 master is
 the worst all around.
That's what I'm saying. I don't understand what you're arguing for/against.
I'm arguing for _not_ having the compiler complain if you defined opCmp but not opEquals, and instead having it do exactly what it did with 2.065 - which is to automatically define opEquals and toHash for you if you didn't define them.

H.S. Teoh is arguing for changing it so that the generated opEquals uses opCmp if it's present, which generally makes the performance of the generated opEquals worse if opCmp was consistent with the default opEquals (which it should have been, since it didn't define its own), and worse, it forces all types used as keys in AAs to define toHash, because then the generated opEquals is no longer consistent with the generated toHash.

What I'm arguing for would only break code which failed to define opEquals to be consistent with opCmp - so only code which is already broken. No other code would break. However, the current situation with git master and H.S. Teoh's suggestion would both break existing code.

- Jonathan M Davis
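The toHash consequence can be seen in a hand-written version of the same idea. The sketch below uses a hypothetical `Version` type whose opEquals simply forwards to opCmp; because equality then ignores the `note` field, the default member-wise hash would no longer agree with it, so a matching toHash has to be written as well. It assumes a recent druntime where `hashOf(value, seed)` is available.

```d
struct Version
{
    int major;
    int minor;
    string note; // hypothetical field that the ordering ignores

    int opCmp(const Version rhs) const
    {
        if (major != rhs.major) return major < rhs.major ? -1 : 1;
        if (minor != rhs.minor) return minor < rhs.minor ? -1 : 1;
        return 0;
    }

    // Equality defined as "opCmp returns 0".
    bool opEquals(const Version rhs) const
    {
        return opCmp(rhs) == 0;
    }

    // Must hash only what opCmp looks at, or equal keys could land in
    // different buckets.
    size_t toHash() const nothrow @safe
    {
        return hashOf(minor, hashOf(major));
    }
}

unittest
{
    auto a = Version(1, 2, "first build");
    auto b = Version(1, 2, "rebuilt");

    assert(a == b);
    assert(a.toHash() == b.toHash());

    int[Version] downloads;
    downloads[a] = 1;
    downloads[b] += 1;             // same key as a
    assert(downloads.length == 1);
    assert(downloads[a] == 2);
}
```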
Jul 25 2014
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 06:56:40PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
[...]
 So, if we remove the new check for a user-defined opEquals when opCmp
 is defined, then you don't have to define opEquals.
This is even worse, since it may silently introduce runtime breakage. The original type may have had opCmp defined just so it can be used with AA's, and now after the upgrade to 2.066, suddenly the custom opCmp code they wrote specifically to work with AA's doesn't get used anymore, yet the compiler silently accepts the code, and the problem won't be found until runtime.
 If we do what H.S. Teoh suggests, then you'll have to define it if you
 want to avoid the additional checks that opCmp would be doing that
 opEquals wouldn't do, but if you didn't care, then you wouldn't.
Which IMO fits the D motto of being correct first, and performant if you ask for it.
 If we leave it as it is in git master, then you'd always have to
 define it if you defined opCmp and wanted to use it as an AA key, and
 since opCmp was used for AA keys before, that means that _every_ type
 which didn't define opEquals but was used as an AA key will suddenly
 have to define opEquals and toHash and will thus now be broken. So,
 the current situation in git master is the worst all around.
[...]
Yes, it's clear that *something* needs to be done about the current situation in git master. It's just a question of what.

Having said that, though, I do agree that the current situation in git master is the most *pedantically* correct, in the sense that it now properly enforces correct usage of opCmp/opEquals/toHash. If we were starting from scratch, I would definitely vote for the current behaviour. The problem, however, is that we have to deal with the historical baggage of existing code that was written to conform to the old buggy AA design, and we'd like to minimize gratuitous code breakage. The current situation in git master is a very poor way of handling this.

If people are opposed to making the default opEquals be opCmp()==0 (if the user defines opCmp), could it at least be used as a deprecation path for phasing out the old usage of opCmp for AA keys and migrating to opEquals? That is, we introduce opEquals = (opCmp()==0) for 2.066 in order to minimize code breakage, but put a deprecation notice on it, and then remove it in 2.067 (or whatever the release will be given the current deprecation cycle).


T

-- 
MASM = Mana Ada Sistem, Man!
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 1:47 PM, H. S. Teoh via Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 06:56:40PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 [...]
 So, if we remove the new check for a user-defined opEquals when opCmp
 is defined, then you don't have to define opEquals.
This is even worse, since it may silently introduce runtime breakage. The original type may have had opCmp defined just so it can be used with AA's, and now after the upgrade to 2.066, suddenly the custom opCmp code they wrote specifically to work with AA's doesn't get used anymore, yet the compiler silently accepts the code, and the problem won't be found until runtime.
Once again,

"The thing is, either this suffers from == behaving differently than AAs, or you've made opEquals superfluous by defining it to be opCmp==0. The latter is a mistake, as Andrei has pointed out, as opCmp may not have a concept of equality, and opEquals may not have a concept of ordering. I.e. it's not just about AAs."

What you're defending is code of this sort running without error:

    S[T] aa;
    S s;
    aa[t] = s;
    assert(s != aa[t]);
 Which IMO fits the D motto of being correct first, and performant if you
 ask for it.
See the above snippet.
Jul 25 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 5:08 AM, Jacob Carlborg wrote:
 I see no reason why I should define opEquals when opCmp was used for AA keys.
 You keep ignoring that argument.
Allow me to repeat: "The thing is, either this suffers from == behaving differently than AAs, or you've made opEquals superfluous by defining it to be opCmp==0. The latter is a mistake, as Andrei has pointed out, as opCmp may not have a concept of equality, and opEquals may not have a concept of ordering." This is the crux of the matter.
Jul 25 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/14, 9:45 AM, H. S. Teoh via Digitalmars-d wrote:
 This morning, I discovered this major WAT in D:

 ----
 struct S {
          int x;
          int y;
          int opCmp(S s) {
                  return x - s.x; // compare only x
          }
 }

 void main() {
          auto s1 = S(1,2);
          auto s2 = S(1,3);
          auto s3 = S(2,1);

          assert(s1 < s3); // OK
          assert(s2 < s3); // OK
          assert(s3 > s1); // OK
          assert(s3 > s2); // OK
          assert(s1 <= s2 && s2 >= s1); // OK
          assert(s1 == s2); // FAIL -- WAT??
 }
 ----

 The reason for this is that the <, <=, >=, > operators are defined in
 terms of opCmp (which, btw, is defined to return 0 when the objects
 being compared are equal), but == is defined in terms of opEquals. When
 opEquals is not defined, it defaults to the built-in compiler
 definition, which is a membership equality test, even if opCmp *is*
 defined, and returns 0 when the objects are equal.

 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
 says this is the case (unfortunately I'm at work so I can't check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(
Getting back to the root of it: I don't think this is a WAT.

Types may choose to define lax ordering comparisons, such as case-insensitive ordering for sorting purposes. Such comparisons create large equivalence classes. Deeming two objects equal, on the other hand, must be quite a bit more stringent.

A WAT would be defining ordering comparison to be case insensitive and then finding stuff in a hashtable that wasn't put there.

Clearly for common arithmetic types, !(a < b) && !(b < a) is the same as a == b. However, that's not the case for a variety of types and orderings. The one relationship between the two operators would be that if a == b then !(a < b) && !(b < a).

I think we should remove the breakage introduced by requiring opEquals if opCmp is defined. It breaks good code for no good reason.


Andrei
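
For illustration, the case-insensitive scenario described above might look like the following sketch. The type name `Label` and its field are hypothetical and not taken from any post in this thread; `icmp` is the case-insensitive comparison from Phobos' std.uni. opEquals is spelled out explicitly so that the example does not depend on which default the compiler picks.

```d
import std.uni : icmp;

// Hypothetical type, named for illustration only.
struct Label {
    string text;

    // Lax ordering: case-insensitive, intended for sorting.
    int opCmp(const Label rhs) const {
        return icmp(text, rhs.text);
    }

    // Strict equality: exact, case-sensitive match.
    bool opEquals(const Label rhs) const {
        return text == rhs.text;
    }
}

void main() {
    auto a = Label("Hello");
    auto b = Label("hello");

    assert(!(a < b) && !(b < a)); // same equivalence class for ordering
    assert(a != b);               // yet the two are not equal
    assert(a == Label("Hello"));  // equal values are never ordered apart
    assert(!(a < Label("Hello")) && !(Label("Hello") < a));
}
```

Here opCmp returning 0 only says that neither value sorts before the other; it makes no claim that the two are interchangeable.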
Jul 25 2014
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Jul 25, 2014 at 06:44:36PM -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 I think we should remove the breakage introduced by requiring opEquals
 if opCmp is defined. It breaks good code for no good reason.
[...]

Yes, we should revert that.


T

-- 
If lightning were to ever strike an orchestra, it'd always hit the conductor first.
Jul 25 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2014 10:40 PM, H. S. Teoh via Digitalmars-d wrote:
 On Fri, Jul 25, 2014 at 06:44:36PM -0700, Andrei Alexandrescu via
Digitalmars-d wrote:
 I think we should remove the breakage introduced by requiring opEquals
 if opCmp is defined. It breaks good code for no good reason.
Yes, we should revert that.
https://github.com/D-Programming-Language/dmd/pull/3813
Jul 25 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/23/2014 06:45 PM, H. S. Teoh via Digitalmars-d wrote:
 This morning, I discovered this major WAT in D:

 ----
 struct S {
          int x;
          int y;
          int opCmp(S s) {
                  return x - s.x; // compare only x
          }
 }
 ...
Not even transitive!

void main() {
    auto a=S(~0<<31);
    auto b=S(0);
    auto c=S(1);
    assert(a<b); // OK
    assert(b<c); // OK
    assert(a<c); // FAIL
}

=P
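
The failure comes from signed overflow in `x - s.x`. A minimal sketch of one way to avoid it, assuming the intent is still to compare only `x`, is to compare the fields directly instead of subtracting them (this is an illustration, not something proposed in the thread):

```d
struct S {
    int x;
    int y;

    // Compare only x, but without subtracting: x - s.x can overflow
    // when the operands have opposite signs and large magnitude.
    int opCmp(S s) const {
        return x < s.x ? -1 : (x > s.x ? 1 : 0);
    }
}

void main() {
    auto a = S(int.min); // same value as ~0 << 31
    auto b = S(0);
    auto c = S(1);

    assert(a < b);
    assert(b < c);
    assert(a < c); // holds with the comparison-based opCmp
}
```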
 ...

 Why isn't "a==b" rewritten as "a.opCmp(b)==0"?? I'm pretty sure TDPL
 says this is the case (unfortunately I'm at work so I can't check my
 copy of TDPL).

 https://issues.dlang.org/show_bug.cgi?id=13179

 :-(
 ...
My 5 cents:

There seems to be confusion about whether a.opCmp(b)==0 means that a and b are equal, or that a and b are unordered.

1. Based on the current rewrite rules, making operators <= and >= yield true if opCmp returns 0, a.opCmp(b)==0 means that a and b are equal, and one should follow the following general rules:

a < b ⇔ b > a
a.opCmp(b)==0 ⇒ a == b
a.opCmp(b)<>=0 && a == b ⇒ a.opCmp(b)==0

In particular, if opCmp returns a totally ordered type:

a.opCmp(b)==0 ⇔ a == b

So, rewriting a == b to a.opCmp(b)==0 would make sense in terms of semantics, given that opCmp returns such an ordered type.

2. If a.opCmp(b)==0 actually means that a and b are unordered, then <= and >= are currently rewritten the wrong way. This is fixed by, e.g., making:

a <= b → a<b||a.opEquals(b), avoiding repeated evaluation of side-effects.

To get the current behaviour of <= and >=, one should then use !> and !<.

To me, the current situation seems to be that DMD assumes meaning 1 and Phobos assumes meaning 2.
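
As a rough sketch of meaning 1, a type can make the relationship explicit by defining opEquals in terms of opCmp itself, so that == and the ordering operators cannot disagree. The type `Release` and its fields are hypothetical, chosen only for illustration:

```d
// Hypothetical type, named for illustration only.
struct Release {
    int major;
    int minor;
    string note; // deliberately ignored by both comparisons

    // Total order on (major, minor).
    int opCmp(const Release rhs) const {
        if (major != rhs.major) return major < rhs.major ? -1 : 1;
        if (minor != rhs.minor) return minor < rhs.minor ? -1 : 1;
        return 0;
    }

    // Meaning 1: opCmp returning 0 is taken to mean "equal".
    // A type used as an AA key would also need a matching toHash.
    bool opEquals(const Release rhs) const {
        return opCmp(rhs) == 0;
    }
}

void main() {
    auto a = Release(1, 2, "first build");
    auto b = Release(1, 2, "rebuild");
    auto c = Release(1, 3, "next");

    assert(a == b);           // opCmp gives 0, so the values are equal
    assert(a <= b && a >= b); // consistent with ==
    assert(a < c && a != c);
}
```

Under meaning 2 this definition would be wrong, since values that are merely unordered would compare as equal.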
Jul 28 2014