www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - A Fresh Look at Comparisons, Take 2

reply "Janice Caron" <caron800 googlemail.com> writes:
This is an improved proposal based on the suggestions of Steven,
Yigal, Bruce and Paul. Basically, I've ditched my original proposal
entirely, and replaced it with this one. It doesn't do exactly what
the previous proposal suggested, but nonetheless, it will still do the
"right" thing, at least in the eyes of the posters to the last thread.


RULE 1:

We have a new function paramter storage class, "explicit", which means
that only objects of the exact compile-time type may be passed in.
(Implicit casting is forbidden, apart from implicit casting to const).
So, for example:

    class A {}
    class B : A {}

    void f(explicit A a)
    {
        writefln("Got an A");
    }

    A a1 = new A;
    f(a1); /* OK */

    A a2 = new B;
    f(a2); /* OK - we only care about the compile-time type */

    B b = new B;
    f(b); /* ERROR - WILL NOT COMPILE */

However, we make an exception for implicit casts to const. Thus:

    void f(explicit const(A) a)
    {
        writefln("Got a const A");
    }

    A a1 = new A;
    f(a1); /* OK */

The new keyword "explicit" can also be applied to the "this" parameter
of member functions, as follows:

    class A
    {
        explicit void f()
        {
            writefln("Got an A");
        }
    }

    class B : A {}

    A a1 = new A;
    a1.f(); /* OK */

    A a2 = new B;
    a2.f(); /* OK - we only care about the compile-time type */

    B b = new B;
    b.f(); /* ERROR - WILL NOT COMPILE */

Observe that explicit member functions cannot be inherited, however,
the inheritance mechanism is able to "jump over" an explict function
when attempting to find a match, in order to match the next
non-explicit function. I didn't explain that very well, did I? Let me
show you what I mean by example:

    class A
    {
        int f() { writefln("In A"); }
    }

    class B : A
    {
        explicit int f() { writefln("In B"); }
    }

    class C : A
    {
    }

    A a = new A;
    B b = new B;
    C c = new C;

    a.f(); // prints "In A"
    b.f(); // prints "In B"
    c.f(); // prints "In A"

Here, the call to c.f() matches no function in C. It cannot match
B.f(), because B.f() is explicit. But it /can/ match A.f(). And so it
does. As far as objects of static type C are concerned. B's explicit
functions might as well not exist.



RULE 2:

Neither opEquals nor opCmp shall be defined in Object.


RULE 3:

When the compiler encounters an equality test, such as

    if (a == b) ...
    if (a != b) ...

and no compile-time match can be found for a.opEquals(b), then the
identity test shall be performed. That is, it shall be as if the
programmer had typed:

    if (a is b) ...
    if (a !is b) ...

Note that this rule is only applicable to classes. Structs which do
not implement opEquals shall be compared for exact bitwise equality,
as is the case currently.


RULE 4:

When the compiler encounters a comparison test other than equality, such as

    if (a < b) ...

and no compile-time match can be found for a.opCmp(b), then all such
comparisons shall be compile-time errors.

Observation: It should be possible for the programmer to test for this
at compile time, by doing, for example:

    static if (is(a < b))
    {
        ...
    }



RULE 5:

No type A may be used as an associtive array key unless a match can be
found for A.opCmp(A).

Observation: It should be possible for the programmer to test for this
at compile time, by doing, for example:

    static if (is(int[A]))
    {
        ...
    }


RULE 6:

Classes which implement opEquals and/or opCmp will be expected to
implement them using the following signatures:

    // For classes
    class C
    {
        explicit bool opEquals(explicit const(C) c) const {...}
        explicit int opCmp(explicit const(C) c) const {...}
    }

    // For structs
    struct S
    {
        explicit bool opEquals(explicit ref const(S) s) const {...}
        explicit int opCmp(explicit ref const(S) s) const {...}
    }

(However, I don't see any way of enforcing this, so this rule
represents recommended good practice).


RULE 7:

The following tokens shall be removed from the lexer, and shall no
longer have meaning in D:

    <>
    <>=
    !<>
    !<>=
    !<
    !<=
    !>
    !>=

Rationale: Every class is now either completely ordered (if opCmp is
supplied), or unordered (if not). If a class is completely ordered,
then <> is the same as !=; !> is the same as <=, and so on.
Conversely, if a class is unordered, then rule 4 makes all such
comparisons compile-time errors. Consequently, there is no longer any
need for these tests.


----

How does all that sound?
Apr 18 2008
next sibling parent reply Henning Hasemann <hhasemann web.de> writes:
 How does all that sound?

I like most of the stuff but a little point still makes me grumble (is this a word?): - You can still not build partially ordered classes at all? (Ie something similar to complex numbers) Is there a rationale for this? - You can still have opCmp say two objects are equal and opEqual say they're not? (or vice versa, respectively) No need to say my first proposal addresses these issues (but leaves out others, though). Henning -- GPG Public Key: http://pgpkeys.pca.dfn.de/pks/lookup?op=get&search=0xDDD6D36D41911851
Apr 18 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Janice Caron wrote:
 On 19/04/2008, Henning Hasemann <hhasemann web.de> wrote:
 Yes, that's true. But remember, partially ordered types cannot be used

> with total ordering), so we've got a problem right there. How do we > tell AAs "Yes, we've implemented opCmp(), but you're not allowed to > use it because it's partial"? By using a syntax different to opCmp ;)

You mean, like I suggested earlier in this thread? ;) class P { bool partialLess(P p) {...} bool partialGreater(P p) {...} }

prepend those methods with "op" and the compiler would only generate the operators you specified under your partially ordered class. if you only define opPartialLess than you can do (a < b) and (a > b) would be an error. I'm not sure if that makes sense or not, but for partially ordered sets, is it possible (mathematically speaking) to have only one of the operators defined? --Yigal
Apr 19 2008
parent Yigal Chripun <yigal100 gmail.com> writes:
Janice Caron wrote:
  is it possible (mathematically speaking) to have only one of the
  operators defined?

Only if you allowed opPartialLess to return three values (true, false, and can't compare). For a /fully/ ordered set, you only need boolean <, because you can derive everything else. i.e. (a > b) == (b < a) (a == b) == (!(a < b || b < a)) etc. but what makes an ordering partial is that there may exist some elements, x and y say, for which x and y cannot be ordered. For such elements, neither (x < y) nor (x > y) have meaning, and both are considered to evaluate to false (because boolean expressions must be either true or false). So you'd either need both opPartialLess() and opPartialGreater() being boolean, or you'd need a single opPartialCmp() which could return an enum { LESS, EQUAL, GREATER, UNORDERED }. Does that make sense?

Thanks for the explanation :) I'd prefer in that case either two boolean operators (the first suggestion) or an opCmp that would throw an exception... (I a purist and don't like enum return types that indicate errors...) in any case I think all those strange !<> type comparisons should be removed from D. they are not intuitive (at least for me). --Yigal
Apr 20 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Henning Hasemann <hhasemann web.de> wrote:
  - You can still not build partially ordered classes at all? (Ie
   something similar to complex numbers) Is there a rationale for this?

Only that Paul said it wasn't necessary, and, on reflection, his logic is reasonable. Basically, I think the reasoning is... what's the point? After all, if c and d are complex numbers, then if (c < d) could easily be rewritten as if (c.im == 0 && d.im == 0 && c.re < d.re)
  - You can still have opCmp say two objects are equal and opEqual say
   they're not? (or vice versa, respectively)

Sure. You can also define opAdd() to do subtraction. Sometimes, you just have to rely on common sense. (Also, there are rare circumstances where you might want to do that deliberately, so that, for example, an AA can contain duplicate "equal" keys). Anyway, I don't really see why we would consider that a problem.
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
I'm not clear on whether one needs partial ordering built into a
programming language. Certainly partially ordered sets exist in
mathematics. For example, consider the set of four points P = { a, b,
c, d }, defined such that

    a < b < d
    a < c < d

but where b cannot be compared with c (a diamond-shaped arrangement).
If D fully supported partially ordered types, then we would have

    (b < c) == false
    (b > c) == false
    (b <> c) == false
    (b == c) == false

and elements of type P could not be AA keys. The question is, does D
need to support this at the level of built-in comparisons? After all,
one could easily bolt that sort of thing on afterwards as a bunch of
plain functions, e.g.

    class P
    {
        bool partialLess(P p) {...}
        bool partialGreater(P p) {...}
    }

for those classes that need it. Library solutions such as
std.algorithm may be able to accept partially ordered sets where
appropriate. But I think that it's such a rare edge-case that having
it built into the language itself is probably unnecessary. After all,
we don't have that now, and I've never seen a complaint about its
absence.
Apr 18 2008
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Janice Caron Wrote:

 How does all that sound?

IMHO, explicit is dangerous. It suffers from all the normal pitfalls of non-virtual function calls. Explicit helps when you have all the raw objects and have to be explicit in how you use them, but when calling functions that accept a base class or using an object factory, it gets more complex. I also think no support comparing floating point numbers is unacceptable. Other than that, I like the rest of the proposal (AKA rules 2-5)
Apr 18 2008
parent reply Jason House <jason.james.house gmail.com> writes:
Janice Caron Wrote:

 On 18/04/2008, Jason House <jason.james.house gmail.com> wrote:
 IMHO, explicit is dangerous.  It suffers from all the normal pitfalls of
non-virtual function calls.  Explicit helps when you have all the raw objects
and have to be explicit in how you use them, but when calling functions that
accept a base class or using an object factory, it gets more complex.

You wouldn't use it in that circumstance. You'd use it when you needed it - i.e. when you desired the behavior it provides. It neatly solves the problem of class A { /* provide opCmp */ } class B : A {} class C : A {} B b = new B; C c = new C; if (b < c) ... doing the wrong thing.

How about this: class A { /* provide opCmp */ } bool foo(A a1, A a2){ return a1<a2; } ... class B : A {} class C : A {} B b = new B; C c = new C; return foo(B,C); I'm sure you could say that foo must now declare its parameters as explicit, but requiring all functions that compare A's to have explicit parameters really starts to kill the whole utility of having a base class to begin with.
Apr 18 2008
parent Jason House <jason.james.house gmail.com> writes:
Janice Caron Wrote:

 On 18/04/2008, Jason House <jason.james.house gmail.com> wrote:
  I'm sure you could say that foo must now declare its parameters as explicit,
but requiring all functions that compare A's to have explicit parameters really
starts to kill the whole utility of having a base class to begin with.

That's a good point, but again, nothing /requires/ explicit parameters. It's like anything else - if you desire the behavior, use it; if you don't, don't. But that is a very good observation, yes. Well spotted.

That's not the only problem, just an example. The types of problems are the same as virtual functions vs. non-virtual functions. I think that if you can't rework the proposal to remove the use of "explicit", then there are likely going to be gotchas floating around somewhere. It seems like proper execution of comparison operators is a lot like the double dispatch problem http://en.wikipedia.org/wiki/Double_dispatch
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Jason House <jason.james.house gmail.com> wrote:
  I also think no support comparing floating point numbers is unacceptable.

All built-in types would obviously continue to be comparable!
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Jason House <jason.james.house gmail.com> wrote:
 IMHO, explicit is dangerous.  It suffers from all the normal pitfalls of
non-virtual function calls.  Explicit helps when you have all the raw objects
and have to be explicit in how you use them, but when calling functions that
accept a base class or using an object factory, it gets more complex.

You wouldn't use it in that circumstance. You'd use it when you needed it - i.e. when you desired the behavior it provides. It neatly solves the problem of class A { /* provide opCmp */ } class B : A {} class C : A {} B b = new B; C c = new C; if (b < c) ... doing the wrong thing.
Apr 18 2008
prev sibling next sibling parent reply Yigal Chripun <yigal100 gmail.com> writes:
Janice Caron wrote:
 This is an improved proposal based on the suggestions of Steven,
 Yigal, Bruce and Paul. Basically, I've ditched my original proposal
 entirely, and replaced it with this one. It doesn't do exactly what
 the previous proposal suggested, but nonetheless, it will still do the
 "right" thing, at least in the eyes of the posters to the last thread.


 RULE 1:

 We have a new function paramter storage class, "explicit", which means
 that only objects of the exact compile-time type may be passed in.
 (Implicit casting is forbidden, apart from implicit casting to const).
 So, for example:

     class A {}
     class B : A {}

     void f(explicit A a)
     {
         writefln("Got an A");
     }

     A a1 = new A;
     f(a1); /* OK */

     A a2 = new B;
     f(a2); /* OK - we only care about the compile-time type */

     B b = new B;
     f(b); /* ERROR - WILL NOT COMPILE */

 However, we make an exception for implicit casts to const. Thus:

     void f(explicit const(A) a)
     {
         writefln("Got a const A");
     }

     A a1 = new A;
     f(a1); /* OK */

 The new keyword "explicit" can also be applied to the "this" parameter
 of member functions, as follows:

     class A
     {
         explicit void f()
         {
             writefln("Got an A");
         }
     }

     class B : A {}

     A a1 = new A;
     a1.f(); /* OK */

     A a2 = new B;
     a2.f(); /* OK - we only care about the compile-time type */

     B b = new B;
     b.f(); /* ERROR - WILL NOT COMPILE */

 Observe that explicit member functions cannot be inherited, however,
 the inheritance mechanism is able to "jump over" an explict function
 when attempting to find a match, in order to match the next
 non-explicit function. I didn't explain that very well, did I? Let me
 show you what I mean by example:

     class A
     {
         int f() { writefln("In A"); }
     }

     class B : A
     {
         explicit int f() { writefln("In B"); }
     }

     class C : A
     {
     }

     A a = new A;
     B b = new B;
     C c = new C;

     a.f(); // prints "In A"
     b.f(); // prints "In B"
     c.f(); // prints "In A"

 Here, the call to c.f() matches no function in C. It cannot match
 B.f(), because B.f() is explicit. But it /can/ match A.f(). And so it
 does. As far as objects of static type C are concerned. B's explicit
 functions might as well not exist.



 RULE 2:

 Neither opEquals nor opCmp shall be defined in Object.


 RULE 3:

 When the compiler encounters an equality test, such as

     if (a == b) ...
     if (a != b) ...

 and no compile-time match can be found for a.opEquals(b), then the
 identity test shall be performed. That is, it shall be as if the
 programmer had typed:

     if (a is b) ...
     if (a !is b) ...

 Note that this rule is only applicable to classes. Structs which do
 not implement opEquals shall be compared for exact bitwise equality,
 as is the case currently.


 RULE 4:

 When the compiler encounters a comparison test other than equality, such as

     if (a < b) ...

 and no compile-time match can be found for a.opCmp(b), then all such
 comparisons shall be compile-time errors.

 Observation: It should be possible for the programmer to test for this
 at compile time, by doing, for example:

     static if (is(a < b))
     {
         ...
     }



 RULE 5:

 No type A may be used as an associtive array key unless a match can be
 found for A.opCmp(A).

 Observation: It should be possible for the programmer to test for this
 at compile time, by doing, for example:

     static if (is(int[A]))
     {
         ...
     }


 RULE 6:

 Classes which implement opEquals and/or opCmp will be expected to
 implement them using the following signatures:

     // For classes
     class C
     {
         explicit bool opEquals(explicit const(C) c) const {...}
         explicit int opCmp(explicit const(C) c) const {...}
     }

     // For structs
     struct S
     {
         explicit bool opEquals(explicit ref const(S) s) const {...}
         explicit int opCmp(explicit ref const(S) s) const {...}
     }

 (However, I don't see any way of enforcing this, so this rule
 represents recommended good practice).


 RULE 7:

 The following tokens shall be removed from the lexer, and shall no
 longer have meaning in D:

     <>
     <>=
     !<>
     !<>=
     !<
     !<=
     !>
     !>=

 Rationale: Every class is now either completely ordered (if opCmp is
 supplied), or unordered (if not). If a class is completely ordered,
 then <> is the same as !=; !> is the same as <=, and so on.
 Conversely, if a class is unordered, then rule 4 makes all such
 comparisons compile-time errors. Consequently, there is no longer any
 need for these tests.


 ----

 How does all that sound?
   

I still have however a few questions: since the above "explicit" suggestion is slightly different from what I thought, i still wonder about the following case: class A { int number; ... } class B: A { string name; ... } A obj = new B; A defines an explicit opCmp/opEquals as suggested above. with the above suggestion the opCmp would accept the above "obj" ot type B since it has static type A. is the above behavior acceptable? I'm not sure. for example, what if my code retrieves A's from a collection, so by comparing only number and not name it can retrieve the wrong instance from the collection(a real example would be a collection of widgets in a GUI toolkit representing all the controls of a screen). the compiler cannot check this at compile time so in order to prevent this the check should happen at run-time and an exception should be thrown on error. another issue is purely syntactical - we seem to add more and more keywords to D, and since IMO "explicit" should add runtime checks (not compile checks as stated in the above suggestion) this "screams" to me as a perfect case for annotations. D really needs a standard facility to add user defined annotations (metadata) to code and instead of adding more keywords to D, the above "explicit" would be a perfect candidate to be provided by the standard library. other annotations that could be also provided by the standard library (I keep using this phrase since I prefer tango over phobos and keep wishing for a merger) are "pure", "thread-safe", even "const" / "invariant". less bloat for D more power to the programmer. the annotation facility Ideally would provide hooks via an API to the compiler so that annotations could be applied ar compile time by the compiler (const) and/or runtime (explicit). one last thing, adding a "comparable" interface/annotation could be useful, so that AA's keys would need to implement this interface. of course, all builtins would be comparable by default [except complex that walter wants to move out of the language anyway....] --Yigal
Apr 18 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Janice Caron wrote:
 <snip>...But I do agree with you about needing a
 mechanism for such things. After all, we don't add a new keyword every
 time we add a function, so why should we need a new keyword for each
 new type-constructor or storage class? Someone else pointed out that
 the   sign is unused in D, and could be used for annotations ... but
 that's an issue for another thread.
   

should be optional as you said. Robert said on a different thread that many good ideas and suggestions get lost in the depth of the NG threads. since I don't know how to use the bugzilla and don't even know where it is, may I ask you to please add this suggestion to there as an enhancement request so it won't be lost and maybe Walter would even agree to add it to D [fingers crossed... :)]. regarding annotations/attributes: should we move this to a new thread and further discuss it, or maybe add it to bugzilla now as well? -- Yigal
Apr 18 2008
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 18/04/2008, Yigal Chripun <yigal100 gmail.com> wrote:
  may I ask you to please
  add this suggestion to there as an enhancement request

Not yet. I'm not sure there's a consensus at this stage. I tend to throw idea out and see how well they float, and there may be problems no one's thought of yet. Let's wait and see how it pans out. But yeah - if it pans out, I can do that eventually.

The discussion has gone back and forth so much that if you do get consensus it's likely to be a consensus of 2 or 3 people. If you want a wider audience I suggest writing up the proposal in its final form. This is not only relevant for this particular thread, but in general if you want to get consensus on these long threads, then you should condense the contents down and make an official proposal out of it. Perhaps you might find these guidelines from Python PEP process useful: http://www.python.org/dev/peps/pep-0001/#what-belongs-in-a-successful-pep Not many people are going to read a big long thread to try to piece together what the proposal actually is. Especially the one person who really matters here, Walter. So if you think it's a super idea then you should write it up in a condensed format. For now you can post it to the NG. Or on a wiki page somewhere. Anything that gives the proposal a specific single-page URL for future reference instead of "check out this thread". Just the 2c of someone who, like Walter, is not likely to re-read the entirety of this thread to see if he agrees or not. --bb
Apr 18 2008
next sibling parent Yigal Chripun <yigal100 gmail.com> writes:
Bill Baxter wrote:
 <snip>
 This is not only relevant for this particular thread, but in general
 if you want to get consensus on these long threads, then you should
 condense the contents down and make an official proposal out of it.

 Perhaps you might find these guidelines from Python PEP process useful:
 http://www.python.org/dev/peps/pep-0001/#what-belongs-in-a-successful-pep

 </snip>

PEPs. D needs an official place on its website where proposals can be added and managed. Also, a similar concept is present in other languages as well - Java for example. -- Yigal
Apr 19 2008
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 19/04/2008, Bill Baxter <dnewsgroup billbaxter.com> wrote:
  Not many people are going to read a big long thread to try to piece
 together what the proposal actually is.  Especially the one person who
 really matters here, Walter.

I get that, but the trouble is, even /I/ don't think it's a great idea at this stage, despite the fact that I suggested it.

Ah, ok. I wasn't getting that. Don't mind me, then. --bb
Apr 19 2008
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 Observe that explicit member functions cannot be inherited, however,
 the inheritance mechanism is able to "jump over" an explict function
 when attempting to find a match, in order to match the next
 non-explicit function. I didn't explain that very well, did I? Let me
 show you what I mean by example:

    class A
    {
        int f() { writefln("In A"); }
    }

    class B : A
    {
        explicit int f() { writefln("In B"); }
    }

    class C : A
    {
    }

    A a = new A;
    B b = new B;
    C c = new C;

    a.f(); // prints "In A"
    b.f(); // prints "In B"
    c.f(); // prints "In A"

 Here, the call to c.f() matches no function in C. It cannot match
 B.f(), because B.f() is explicit. But it /can/ match A.f(). And so it
 does. As far as objects of static type C are concerned. B's explicit
 functions might as well not exist.

Huh? I think maybe you meant C to inherit from B? If so I see what you are trying to explain :) I don't really agree that opEquals or opCmp should not be inherited. I think normal inheritance the way it is now is just fine. You need to have opEquals in Object in order to support hashing. The default opCmp should be redefined to throw an exception. An alternative is that opCmp is defined in an interface. Or you could do both. The rationale is that opCmp doesn't have a default. Some objects just cannot be expected to have a well-defined order. -Steve
Apr 18 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Steven Schveighoffer wrote:
 Huh?  I think maybe you meant C to inherit from B?  If so I see what you are 
 trying to explain :)

 I don't really agree that opEquals or opCmp should not be inherited.  I 
 think normal inheritance the way it is now is just fine.
   

We are talking about the D default behavior here and it can be overridden. The programmer can define an opCmp that is not explicit and than all derived classes will inherit that opCmp as well, so your desired behavior is still available, it's just not the default anymore.
 You need to have opEquals in Object in order to support hashing.
   

unless opEquals is defined, the compiler defaults to identity comparison. This is equivalent to having an opEquals in Object that does the same thing (which is the current implementation, I think). I don't care if it's implemented inside the compiler, or via opEquals in Object, what's important is the semantics. i.e. unless there's a user defined opCmp for a class, (a == b) will be true iff (a is b) is true [ for two instances a and b of that class].
 The default opCmp should be redefined to throw an exception.  An alternative 
 is that opCmp is defined in an interface.  Or you could do both.  The 
 rationale is that opCmp doesn't have a default.  Some objects just cannot be 
 expected to have a well-defined order.

 -Steve 


   

previous design errors in similar languages. Java implements that idea of a method in Object that throw by default and that proved to be a very fragile design. Object should provide only the methods that all objects contain, and since not all classes are fully ordered, opCmp should not be in Object. I agree with you that providing a "Comparable" interface is a good design. If D was only OOP like java, than this would have been my favorite solution. indeed, Java contains such an interface and you are expected to implement it if your class is comparable. if i remember correctly, there are two interfaces in Java which facilitate the subtle difference between an opEquals and opCmp comparison. as Janice said, this interface is not strictly necessary with her proposal, however such an interface can help document the fact that a class is comparable, so it still could be added. -- Yigal
Apr 18 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Yigal Chripun" wrote
 Steven Schveighoffer wrote:
 Huh?  I think maybe you meant C to inherit from B?  If so I see what you 
 are
 trying to explain :)

 I don't really agree that opEquals or opCmp should not be inherited.  I
 think normal inheritance the way it is now is just fine.

We are talking about the D default behavior here and it can be overridden. The programmer can define an opCmp that is not explicit and than all derived classes will inherit that opCmp as well, so your desired behavior is still available, it's just not the default anymore.

I think the whole goal of this exercise is to avoid having to take 'Object' as the parameter to opCmp and opEquals. Why can't one just provide opCmp(MySpecificClass) as an overload to opCmp? If the compiler sees it at compile time, it calls the more specific one. Maybe in that case, the default opCmp in Object can do the casting for you: int opCmp(Object o) { /* search through vtable for opCmp that takes most derived compatible type, call it with o*/ /* if no such function exists, throw exception */ } If you want to define a more efficient opCmp, then override opCmp(Object o) that will call just the functions you want. It's not any different than today. I don't know how one would write this in D directly, but it could theoretically be done. Adding a new 'explicit' keyword may also solve this problem, but it also appears to introduce a lot of other problems that I don't think we want to have.
 You need to have opEquals in Object in order to support hashing.

unless opEquals is defined, the compiler defaults to identity comparison. This is equivalent to having an opEquals in Object that does the same thing (which is the current implementation, I think). I don't care if it's implemented inside the compiler, or via opEquals in Object, what's important is the semantics. i.e. unless there's a user defined opCmp for a class, (a == b) will be true iff (a is b) is true [ for two instances a and b of that class].

"unless opEquals is defined, the compiler defaults to identity comparison." This is the current implementation. I'm not sure why we need special handling of it.
 The default opCmp should be redefined to throw an exception.  An 
 alternative
 is that opCmp is defined in an interface.  Or you could do both.  The
 rationale is that opCmp doesn't have a default.  Some objects just cannot 
 be
 expected to have a well-defined order.

 -Steve

previous design errors in similar languages. Java implements that idea of a method in Object that throw by default and that proved to be a very fragile design. Object should provide only the methods that all objects contain, and since not all classes are fully ordered, opCmp should not be in Object. I agree with you that providing a "Comparable" interface is a good design. If D was only OOP like java, than this would have been my favorite solution. indeed, Java contains such an interface and you are expected to implement it if your class is comparable. if i remember correctly, there are two interfaces in Java which facilitate the subtle difference between an opEquals and opCmp comparison. as Janice said, this interface is not strictly necessary with her proposal, however such an interface can help document the fact that a class is comparable, so it still could be added.

My preference is the interface. But I'm willing to accept an 'exception throwing' default (or the 'find the right overload' default as defined above) if that is what Walter wants. This is all I'm saying. -Steve
Apr 18 2008
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 Huh?  I think maybe you meant C to inherit from B?

Oh yeah. That was a typo. Whoops! I should have said class C : B { }
  I don't really agree that opEquals or opCmp should not be inherited.  I
  think normal inheritance the way it is now is just fine.

You do? Why? You obviously understand that if Z is a great-great-great-great-...-grandchild of A, then it will still use A's opCmp() function if not overridden? And you're OK with that?
  You need to have opEquals in Object in order to support hashing.

Please could you elaborate, as I don't understand why this is so?
  The default opCmp should be redefined to throw an exception.

Again, I ask, why? Why detect something at run-time that could be detected at compile time? Moreover, if Z is a great-great-great-great-...-grandchild of A, then it will use A's opCmp() function - not Object's. Overriding opCmp even /once/ changes the defaut for the entire heirarchy from that point down, and at that point, Object's implementation becomes irrelevant.
  An alternative
  is that opCmp is defined in an interface.

Yes, that is an alternative. I had thought of that. But then I realised that orderedness can be detected at compile-time without need for an interface, so it seemed unnecessary.
Apr 18 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 On 18/04/2008, Steven Schveighoffer wrote:
  I don't really agree that opEquals or opCmp should not be inherited.  I
  think normal inheritance the way it is now is just fine.

You do? Why? You obviously understand that if Z is a great-great-great-great-...-grandchild of A, then it will still use A's opCmp() function if not overridden? And you're OK with that?

Yes. That is a choice of the great-great-great-...-grandchild of A's author :) By not overriding opEquals or opCmp, he is saying that if you compare it with something else, A's implementation is the one to use. If he wanted to specify that opCmp should be different he would have rewritten opCmp.
  You need to have opEquals in Object in order to support hashing.

Please could you elaborate, as I don't understand why this is so?

A hashtable works by putting objects into buckets based on a hash-code. Usually you mod the hash-code so it fits in an array. At that point, you have to deal with collisions. So a simple method is to use a linked list for each bucket. To look-up that object, you need to traverse the list, finding the object that equals the input. So for example, an int[int], to lookup the int, it first converts to a hash-code, finds the right bucket, then traverses the linked list looking for the int equal to the key. Then it returns the value. So for anything to be a key, it must be able to provide a hash, and you must be able to check for equality.
  The default opCmp should be redefined to throw an exception.

Again, I ask, why? Why detect something at run-time that could be detected at compile time?

I think the better solution is to use the interface. But if that is not acceptable, at the very least, the default implementation should throw an exception.
 Moreover, if Z is a great-great-great-great-...-grandchild of A, then
 it will use A's opCmp() function - not Object's. Overriding opCmp even
 /once/ changes the defaut for the entire heirarchy from that point
 down, and at that point, Object's implementation becomes irrelevant.

The only benefit I see is that you would not have to declare that an object implements the interface. Personally I don't think it's worth it, but I could live with it. The way it is now is completely wrong.
  An alternative
  is that opCmp is defined in an interface.

Yes, that is an alternative. I had thought of that. But then I realised that orderedness can be detected at compile-time without need for an interface, so it seemed unnecessary.

It cannot be detected at compile time: class A { int opCmp(Object o){...} } class B: A { int opCmp(Object o) { throw new UnOrderedException("cannot order B");} } void f(A a, Object o) { a < o; // can you tell at compile time whether a is a B or an A? } -Steve
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Yigal Chripun <yigal100 gmail.com> wrote:
  with the above suggestion the opCmp would accept the above "obj" ot type
  B since it has static type A.
  is the above behavior acceptable? I'm not sure.

Good question.
  for example, what if my code retrieves A's from a collection, so by
  comparing only number and not name it can retrieve the wrong instance
  from the collection(a real example would be a collection of widgets in a
  GUI toolkit representing all the controls of a screen). the compiler
  cannot check this at compile time so in order to prevent this the check
  should happen at run-time and an exception should be thrown on error.

Actually, I don't think that would be a problem, for the following reason: the use of "explicit" would not be compulsory! So in the circumstance you describe, you'd just define an opCmp, and /not/ make it explicit. Then the normal rules of inheritance would apply.
  D really needs a standard facility to add user defined annotations
  (metadata) to code and instead of adding more keywords to D, the above
  "explicit" would be a perfect candidate to be provided by the standard
  library.

Since its purpose is to block inheritance, I'm not completely sure how a library could do that. But I do agree with you about needing a mechanism for such things. After all, we don't add a new keyword every time we add a function, so why should we need a new keyword for each new type-constructor or storage class? Someone else pointed out that the sign is unused in D, and could be used for annotations ... but that's an issue for another thread.
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Jason House <jason.james.house gmail.com> wrote:
  I'm sure you could say that foo must now declare its parameters as explicit,
but requiring all functions that compare A's to have explicit parameters really
starts to kill the whole utility of having a base class to begin with.

That's a good point, but again, nothing /requires/ explicit parameters. It's like anything else - if you desire the behavior, use it; if you don't, don't. But that is a very good observation, yes. Well spotted.
Apr 18 2008
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 Yes.  That is a choice of the great-great-great-...-grandchild of A's author
  :)

Hmmm. I think more bugs are likely to occur as a result of people forgetting to supply opCmp() and having the wrong opCmp() called.
  By not overriding opEquals or opCmp, he is saying that if you compare it
  with something else, A's implementation is the one to use.

...or he might have just forgotten, and it could be a bug.
  If he wanted to
  specify that opCmp should be different he would have rewritten opCmp.

...an /obscure/ bug, no less. One that might be very hard to track down.
  >>  You need to have opEquals in Object in order to support hashing.

 Please could you elaborate, as I don't understand why this is so?

A hashtable works by <snip>

None of which implies that you need opEquals /in Object/. It only implies that you need it in any class that needs to do hashing. Again, this is a perfect case for explicit.
 It cannot be detected at compile time:

  class A
  {
    int opCmp(Object o){...}
  }

  class B: A
  {
    int opCmp(Object o)
    { throw new UnOrderedException("cannot order B");}
  }

  void f(A a, Object o)
  {
    a < o; // can you tell at compile time whether a is a B or an A?
  }

No. On that point, you're right - although the code would look like thus under the new rules: class A { explicit int opCmp(explicit const(A) c) const {...} } class B : A { // no opCmp } void f(A a, A o) { if (a < o)... } Hmmm. I guess it /would/ have to be a runtime exception after all. I stand corrected on that point. (You're also the second person to point out that error, so I guess there's a consensus there). OK, drop the "compile-time" claims! :-)
Apr 18 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 On 18/04/2008, Steven Schveighoffer wrote:
 Yes.  That is a choice of the great-great-great-...-grandchild of A's 
 author
  :)

Hmmm. I think more bugs are likely to occur as a result of people forgetting to supply opCmp() and having the wrong opCmp() called.

I disagree. Not overriding opCmp seems to me far more common than overriding it. Many times one inherits from a class to change one bit of it, a bit that may not affect opCmp. Should the compiler assume the user wants to override opCmp, and do some default action for him? Or should you always have to alias the parent's opCmp? I have a project, for instance, where I've made 10 classes. None of those classes define opCmp, and the project works beautifully without it.
  By not overriding opEquals or opCmp, he is saying that if you compare it
  with something else, A's implementation is the one to use.

...or he might have just forgotten, and it could be a bug.

So what? If he forgot to override f() from the base class that could be an error too. There is no way for the compiler to prevent all bugs.
  >>  You need to have opEquals in Object in order to support hashing.

 Please could you elaborate, as I don't understand why this is so?

A hashtable works by <snip>

None of which implies that you need opEquals /in Object/. It only implies that you need it in any class that needs to do hashing. Again, this is a perfect case for explicit.

It is not necessary in Object, but because it CAN be defined by default, why not? This means any object can be a key without any extra code on the author's part. The default opEquals might not be perfect, but it is consistent. The default opCmp is not consistent. -Steve
Apr 18 2008
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Steven Schveighoffer <schveiguy yahoo.com> wrote:
  >>  >>  You need to have opEquals in Object in order to support hashing.
 Please could you elaborate, as I don't understand why this is so?


It is not necessary in Object

Ah! That would explain why I didn't understand why it was so! :-) I am confused about one point though. I know that one can implement an AA using hashing. I also know that one can implement an AA using comparison. However, it seems to me that one does not need /both/. How are D AA's implemented - do they use hashing, or do they use comparison? And why do the docs say we need both?
Apr 18 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 On 18/04/2008, Steven Schveighoffer wrote:
  >>  >>  You need to have opEquals in Object in order to support hashing.
 Please could you elaborate, as I don't understand why this is so?


It is not necessary in Object

Ah! That would explain why I didn't understand why it was so! :-) I am confused about one point though. I know that one can implement an AA using hashing. I also know that one can implement an AA using comparison. However, it seems to me that one does not need /both/. How are D AA's implemented - do they use hashing, or do they use comparison? And why do the docs say we need both?

That is a good question... I assume that one should override both opEquals and opCmp because if one doesn't, then if opCmp(x) == 0 can return different results than opEquals(x). I'm guessing that the buckets of the hash are somehow sorted lists, maybe they are RB trees (that seems terribly inefficient) or just sorted arrays? I have no idea. But strictly speaking, opCmp is not required for a basic hash implementation. only opEquals to see if you got the right element or not. I'd be interested to know why opCmp is required. -steve
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 18/04/2008, Yigal Chripun <yigal100 gmail.com> wrote:
  may I ask you to please
  add this suggestion to there as an enhancement request

Not yet. I'm not sure there's a consensus at this stage. I tend to throw idea out and see how well they float, and there may be problems no one's thought of yet. Let's wait and see how it pans out. But yeah - if it pans out, I can do that eventually.
  regarding annotations/attributes: should we move this to a new thread
  and further discuss it

Dunno. Move it to another thread I guess. I'm not sure anyone's got a concrete idea of exactly how that would work yet.
Apr 18 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 19/04/2008, Bill Baxter <dnewsgroup billbaxter.com> wrote:
  Not many people are going to read a big long thread to try to piece
 together what the proposal actually is.  Especially the one person who
 really matters here, Walter.

I get that, but the trouble is, even /I/ don't think it's a great idea at this stage, despite the fact that I suggested it. Partly, that's because it's ugly. You have to admit that struct S { explicit bool opEquals(explicit ref const(S) s) const {...} explicit int opCmp(explicit ref const(S) s) const {...} } is not terrifically "friendly" looking. But the /main/ thing that concerns me is that two posters have pointed out that static-typing isn't good enough. That means, you need run-time typing, and that, in turn, means you need double-dispatch (or multimethods), which in turn means it just plain won't work with the suggested syntax. The simplest workarounds are, in fact /current D/. The simplest workaround for inheritance is to supply a function which throws an exception; the simplest workaround for double-dispatch is to accept an Object parameter and do all the tests at run-time. The things that people /really/ care about (e.g. that opEquals should return bool, that opEquals and opCmp should be const) have already been complained about in bugzilla. In summary, enough objections to the other stuff have been brought up on this thread to make me think it's not workable. At least, not yet. In order to be workable, we must first solve (1) annotations, and (2) multiple dispatch. And those are separate subjects.
 So if you think it's a super idea then you
 should write it up in a condensed format.

Of course. But I don't any more, and I wouldn't have learned that if we hadn't discussed it here first.
Apr 18 2008
prev sibling next sibling parent Henning Hasemann <hhasemann web.de> writes:
"Janice Caron" <caron800 googlemail.com> wrote:
 On 18/04/2008, Henning Hasemann <hhasemann web.de> wrote:
  - You can still not build partially ordered classes at all? (Ie
   something similar to complex numbers) Is there a rationale for
 this?

Only that Paul said it wasn't necessary, and, on reflection, his logic is reasonable. Basically, I think the reasoning is... what's the point? After all, if c and d are complex numbers, then if (c < d) could easily be rewritten as if (c.im == 0 && d.im == 0 && c.re < d.re)

so 2i >= 5 would hold? The point is that you might want to create structures that are more complex and not that linear ordered. Think of sets or something. Naturally one can get around this by building up just "normal methods" for comparsion, but there might be casess where its just more beautiful to do it with comparsion operators.
  - You can still have opCmp say two objects are equal and opEqual
 say they're not? (or vice versa, respectively)

Sure. You can also define opAdd() to do subtraction. Sometimes, you just have to rely on common sense.

Sure but then a + b is (for a and b always your type) always substraction. I don't really see a problem in that you can do stuff which doesnt make much sense, I just want to say that this shows that the current notation is not only redundant (regarding ==) but also not as powerful as it could be (regarding partial ordering).
 (Also, there are rare circumstances where you might want to do that
 deliberately, so that, for example, an AA can contain duplicate
 "equal" keys).

Well I think circumstances where you want partial ordering are more common ;) Henning -- GPG Public Key: http://pgpkeys.pca.dfn.de/pks/lookup?op=get&search=0xDDD6D36D41911851
Apr 19 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 19/04/2008, Henning Hasemann <hhasemann web.de> wrote:
  >     if (c.im == 0 && d.im == 0 && c.re < d.re)

 so 2i >= 5 would hold?

No. That assertion is false. 2i < 5 is false 2i == 5 is false 2i > 5 is false In general, for complex c and d, if you want to less for >=, you would do if (c.im == 0 && d.im == 0 && c.re >= d.re) Basically the rule is, if both numbers are not completely real, then they are not comparable, and > and < will both always yield false. (And I'm talking math here - implementation in a programming language might be different - e.g. you might throw an exception instead of returning false, or you might decide to make them incomparable regardless of the real subset).
  The point is that you might want to create structures that are more
  complex and not that linear ordered.

I assume you mean "partially ordered"? Yes, that's true. But remember, partially ordered types cannot be used as AA keys (at least, not if the AA is implemented as a binary tree with total ordering), so we've got a problem right there. How do we tell AAs "Yes, we've implemented opCmp(), but you're not allowed to use it because it's partial"?
Apr 19 2008
prev sibling next sibling parent Henning Hasemann <hhasemann web.de> writes:
 Yes, that's true. But remember, partially ordered types cannot be used
 as AA keys (at least, not if the AA is implemented as a binary tree
 with total ordering), so we've got a problem right there. How do we
 tell AAs "Yes, we've implemented opCmp(), but you're not allowed to
 use it because it's partial"?

By using a syntax different to opCmp ;) -- GPG Public Key: http://pgpkeys.pca.dfn.de/pks/lookup?op=get&search=0xDDD6D36D41911851
Apr 19 2008
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 19/04/2008, Henning Hasemann <hhasemann web.de> wrote:
 Yes, that's true. But remember, partially ordered types cannot be used

> with total ordering), so we've got a problem right there. How do we > tell AAs "Yes, we've implemented opCmp(), but you're not allowed to > use it because it's partial"? By using a syntax different to opCmp ;)

You mean, like I suggested earlier in this thread? ;) class P { bool partialLess(P p) {...} bool partialGreater(P p) {...} }
Apr 19 2008
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
  is it possible (mathematically speaking) to have only one of the
  operators defined?

Only if you allowed opPartialLess to return three values (true, false, and can't compare). For a /fully/ ordered set, you only need boolean <, because you can derive everything else. i.e. (a > b) == (b < a) (a == b) == (!(a < b || b < a)) etc. but what makes an ordering partial is that there may exist some elements, x and y say, for which x and y cannot be ordered. For such elements, neither (x < y) nor (x > y) have meaning, and both are considered to evaluate to false (because boolean expressions must be either true or false). So you'd either need both opPartialLess() and opPartialGreater() being boolean, or you'd need a single opPartialCmp() which could return an enum { LESS, EQUAL, GREATER, UNORDERED }. Does that make sense?
Apr 19 2008