www.digitalmars.com         C & C++   DMDScript  

D - split up cmp?

reply Hauke Duden <H.NS.Duden gmx.net> writes:
I recently thought about the way D handles operator overloading and I 
noticed that the current implementation will not work well in a 
mathematical context.

In math you often work with sets of objects that are only partially 
ordered. You may be able to determine that a < b or b > a, but if none 
of these two things apply it doesn't necessarily mean that a==b.

For example, consider the set of integer tuples.

The following definitions apply:
(x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
(x1,y1) > (x2,y2) <=> x1>x2 && y1>y2

The problem now is that you cannot use D's operator overloading for such 
tuples because there is no way to implement cmp properly.

Example:

You have the following code:

Tuple t1(2,2);
Tuple t2(1,3);

if( t1 < t2 )
	doStuff();

The compiler translates the comparison to a call of t1.cmp(t2).

But how to implement cmp for integer tuples?
- t1 is not < t2 since 3>2
- t2 is not > t2 since 1<2
- but t1 is also != t2 !!

There is not return value defined for such cases.

I really think it is necessary to have separate functions for <, <=, >, 
= comparisons. That way also partially ordered objects could be 

without having to state that "t1 >= t2" or "t1==t2". Hauke
Sep 18 2003
next sibling parent reply "Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> writes:
"Hauke Duden" <H.NS.Duden gmx.net> escreveu na mensagem
news:bkbvpd$2bld$1 digitaldaemon.com...
 I recently thought about the way D handles operator overloading and I
 noticed that the current implementation will not work well in a
 mathematical context.

 In math you often work with sets of objects that are only partially
 ordered. You may be able to determine that a < b or b > a, but if none
 of these two things apply it doesn't necessarily mean that a==b.

 For example, consider the set of integer tuples.

 The following definitions apply:
 (x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
 (x1,y1) > (x2,y2) <=> x1>x2 && y1>y2

 The problem now is that you cannot use D's operator overloading for such
 tuples because there is no way to implement cmp properly.

 Example:

 You have the following code:

 Tuple t1(2,2);
 Tuple t2(1,3);

 if( t1 < t2 )
 doStuff();

 The compiler translates the comparison to a call of t1.cmp(t2).

 But how to implement cmp for integer tuples?
 - t1 is not < t2 since 3>2
 - t2 is not > t2 since 1<2
 - but t1 is also != t2 !!

 There is not return value defined for such cases.

 I really think it is necessary to have separate functions for <, <=, >,
  >= comparisons. That way also partially ordered objects could be
 implemented, since then it would be possible to express "t1 is not < t2"
 without having to state that "t1 >= t2" or "t1==t2".

 Hauke

"cmp" expresses total ordering, which is a relationship between the ordering operators and the comparison operator. Most people expect total ordering, IME it's the most common case. If D implemented both total ordering and partial ordering, a possible solution: interface PartiallyOrdered { bool isLesser(PartiallyOrdered other); bool isGreater(PartiallyOrdered other); } interface TotallyOrdered : PartiallyOrdered { bool isLesser(TotallyOrdered other) out (result) { if (!result) { assert(this == other || this.isGreater(other)); } }; bool isGreater(TotallyOrdered other) out (result) { if (!result) { assert(this == other || this.isLesser(other)); } }; } we should need to decide between these classes to model our classes, but people should need to be specially attent to not do things like: if (a <= b) { printf("a is not greater than b\r\n"); } else { printf("a is greater than b\r\n"); } because the expected semantics would be wrong. Partial ordering is a nice concept but IMO people wouldn't expect it. There are lots of things people expect from mathematical operators, for example "+": nobody thinks that it shouldn't be associative and commutative, but it can. Best regards, Daniel Yokomiso. "First they ignore you, then they laugh at you, then they fight you, then you win." - Mohandas Gandi --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.518 / Virus Database: 316 - Release Date: 12/9/2003
Sep 18 2003
parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Daniel Yokomiso wrote:
 "cmp" expresses total ordering, which is a relationship between the ordering
 operators and the comparison operator. Most people expect total ordering,
 IME it's the most common case.

Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators. Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement: If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value. Hauke
Sep 18 2003
next sibling parent reply "Philippe Mori" <philippe_mori hotmail.com> writes:
 Daniel Yokomiso wrote:
 "cmp" expresses total ordering, which is a relationship between the


 operators and the comparison operator. Most people expect total


 IME it's the most common case.

Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators. Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement: If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value. Hauke

I already think that cmp might be inefficient in some cases... If a fourth state is added, it will only be worst... because the compiler would have to check that at every comparison and typically we won't knows what to do if the result is undefined (in some case all comparison can returns false, in other throwing an exception would be preferable) So we would also need that cmp function could returns different classes according to the comparison... or have 2 cmp functions so that the compiler would know if there is an undefined state and what should we do in such a case (I think we should throw an exception!) In fact, if you throw an exception for your undefined cases in Tuple sample then default cmp would be adequate and no language change would be required... and if performance really matters (and exception are not a solution), you could uses your own comparison function... and this would have the benefit that it will better "document" the code... Another case where undefined exist is when comparing floatting points from what I understand because of case like NAN. I think an exception should be normally thrown but in some case, we need to be able to have a totally orderred answer and in those case, NAN should probably be just before -INF. Another case where undefined could be interesting if for mixed (class) type comparison... and in most case I think the best think would be to throw an exception... I think we should be able to implement some operators and have the compiler provide others... If we provide < and =, the compiler should be able to provide all others and cmp. If we provide cmp, the compiler can provide <, <=, ==, !=, > and >=. We can provide the operator we want... the compiler choose the better one... Also we should be able to specialize generic algorithm on weither cmp or <,= is defined...
Sep 18 2003
parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Philippe Mori wrote:
 I already think that cmp might be inefficient in some cases... If a fourth
 state is added, it will only be worst... because the compiler would have
 to check that at every comparison and typically we won't knows
 what to do if the result is undefined (in some case all comparison can
 returns false, in other throwing an exception would be preferable)

You shouldn't throw an exception in cmp if the relationship is not defined, because testing for e.g. < is a valid operation. Even if the relationship is not defined, it is still certain that it is not <. So the operator should simply return false. I think exceptions should only ever be thrown if the two objects are not not of compatible types. And maybe not even then. About the performance thing: I think a fourth result value wouldn't add any overhead at all. If you define the result as a bit mask (with convenient constants, of course) everything can be checked with a single compare. e.g. enum { UNDEFINED=0 SMALLER=1 EQUAL=2 BIGGER=4 }; Then the < operator could be implemented as return (cmp() == SMALLER); and the <= operator as return (cmp() & (SMALLER | EQUAL)); and so on. Hauke
Sep 18 2003
parent reply "Philippe Mori" <philippe_mori hotmail.com> writes:
 You shouldn't throw an exception in cmp if the relationship is not
 defined, because testing for e.g. < is a valid operation. Even if the
 relationship is not defined, it is still certain that it is not <. So
 the operator should simply return false.

But what will happen in case where everything must be ordered (like it is the case with C++ STL std::less when used for a std::set or std::map for example)? Adding an undefined would cause the undefined item to appears anywhere and the container would be invalid... So anytime someone want to add an item in a sorted container, undefined order is not acceptable... So IMO, we should not support that forth state for any user defined comparison...
 I think exceptions should only ever be thrown if the two objects are not
 not of compatible types. And maybe not even then.

It should be possible to prevent unintended comparison and IMO by default we should not allows any mixed type comparison...
 About the performance thing: I think a fourth result value wouldn't add
 any overhead at all.

And what do you do when that forth state is not acceptable (IMO most of the time)? One case where there would be overhead is in a case like that: if (a < b) { } else if (a ==b) { } else if (a > b) { } else { } If no undefined state is possible, the compiler can easily optimize the third test away and remove unreacheable code in the last else... IMO, undefined is a hole in the language as it does not allows to be able to ensure that something is totally ordered. The only solution that I see is to have 2 posibilities: enum CmpResult { smaller = -1, equal, greater }; enum CmpResultWithUndefined { undefined = -2, smaller, equal, greater }; CmpResult cmp(A a1, A a2) // Undefined not supported CmpResultWithUndefined cmp(A a1, A a2) // Undefined supported And it would be not possible when cmp is virtual to changes the result type to add it but only to remove it...
 If you define the result as a bit mask (with convenient constants, of
 course) everything can be checked with a single compare.

 e.g.

 enum
 {
 UNDEFINED=0
 SMALLER=1
 EQUAL=2
 BIGGER=4
 };

 Then the < operator could be implemented as
 return (cmp() == SMALLER);

 and the <= operator as
 return (cmp() & (SMALLER | EQUAL));

 and so on.

 Hauke

Sep 18 2003
next sibling parent "Sean L. Palmer" <palmer.sean verizon.net> writes:
Check out std::multimap in C++ STL.

Some would want to #define LARGER BIGGER

Maybe there should be a more direct way of testing and setting flags than &
and | operators.  Either some kind of test, set, and reset operators or
maybe bitfields.  Or some kind of special flags type that's alot like an
enum.

Sean

"Philippe Mori" <philippe_mori hotmail.com> wrote in message
news:bkdccj$18fj$1 digitaldaemon.com...
 You shouldn't throw an exception in cmp if the relationship is not
 defined, because testing for e.g. < is a valid operation. Even if the
 relationship is not defined, it is still certain that it is not <. So
 the operator should simply return false.

But what will happen in case where everything must be ordered (like it is the case with C++ STL std::less when used for a std::set or std::map for example)? Adding an undefined would cause the undefined item to appears anywhere and the container would be invalid... So anytime someone want to add an item in a sorted container, undefined order is not acceptable... So IMO, we should not support that forth state for any user defined comparison...
 I think exceptions should only ever be thrown if the two objects are not
 not of compatible types. And maybe not even then.

It should be possible to prevent unintended comparison and IMO by default we should not allows any mixed type comparison...
 About the performance thing: I think a fourth result value wouldn't add
 any overhead at all.

And what do you do when that forth state is not acceptable (IMO most of the time)? One case where there would be overhead is in a case like that: if (a < b) { } else if (a ==b) { } else if (a > b) { } else { } If no undefined state is possible, the compiler can easily optimize the third test away and remove unreacheable code in the last else... IMO, undefined is a hole in the language as it does not allows to be able to ensure that something is totally ordered. The only solution that I see is to have 2 posibilities: enum CmpResult { smaller = -1, equal, greater }; enum CmpResultWithUndefined { undefined = -2, smaller, equal, greater }; CmpResult cmp(A a1, A a2) // Undefined not supported CmpResultWithUndefined cmp(A a1, A a2) // Undefined supported And it would be not possible when cmp is virtual to changes the result type to add it but only to remove it...
 If you define the result as a bit mask (with convenient constants, of
 course) everything can be checked with a single compare.

 e.g.

 enum
 {
 UNDEFINED=0
 SMALLER=1
 EQUAL=2
 BIGGER=4
 };

 Then the < operator could be implemented as
 return (cmp() == SMALLER);

 and the <= operator as
 return (cmp() & (SMALLER | EQUAL));

 and so on.

 Hauke


Sep 19 2003
prev sibling parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Philippe Mori wrote:
 But what will happen in case where everything must be ordered (like
 it is the case with C++ STL std::less when used for a std::set or
 std::map for example)? Adding an undefined would cause the
 undefined item to appears anywhere and the container would be
 invalid... So anytime someone want to add an item in a sorted
 container, undefined order is not acceptable...

I believe that is not really an issue. If a container requires total ordering then you cannot use it with partially ordered objects or else the results are undefined. It is up to the programmer to make sure that an object complies with the requirements of storage classes. For example, I could easily implement a class whose < operator always returns false. That would probably have an adverse effect on container classes. However, does that mean that the language has to have a rule that makes it impossible to return false from a < operator? It is up to the programmer to ensure that the components he uses work together. That is not the job of the language.
About the performance thing: I think a fourth result value wouldn't add
any overhead at all.

And what do you do when that forth state is not acceptable (IMO most of the time)?

What do you mean? It is not a problem at all to implement the comparison operators with an undefined state (see my previous post).
 One case where there would be overhead is in a case like that:
 
 if (a < b)
 {
 }
 else if (a ==b)
 {
 }
 else if (a > b)
 {
 }
 else
 {
 }
 
 If no undefined state is possible, the compiler can easily optimize the
 third test away and remove unreacheable code in the last else...

If you write code like that then obviously you think that you are dealing with partially ordered objects. Otherwise, what would you write in that last else-block?
 IMO, undefined is a hole in the language as it does not allows to be
 able to ensure that something is totally ordered. The only solution
 that I see is to have 2 posibilities:

Why does total ordering have to be a requirement? It is a fact that not all kinds of objects can have a total ordering. Hauke
Sep 19 2003
parent "Philippe Mori" <philippe_mori hotmail.com> writes:
 But what will happen in case where everything must be ordered (like
 it is the case with C++ STL std::less when used for a std::set or
 std::map for example)? Adding an undefined would cause the
 undefined item to appears anywhere and the container would be
 invalid... So anytime someone want to add an item in a sorted
 container, undefined order is not acceptable...

I believe that is not really an issue. If a container requires total ordering then you cannot use it with partially ordered objects or else the results are undefined. It is up to the programmer to make sure that an object complies with the requirements of storage classes. For example, I could easily implement a class whose < operator always returns false. That would probably have an adverse effect on container classes. However, does that mean that the language has to have a rule that makes it impossible to return false from a < operator? It is up to the programmer to ensure that the components he uses work together. That is not the job of the language.

I do not completly agree with that.. In a language that support design by contract, I think we should be able to prevent the instanciation of a class that would not satisfy one container or algorithm requirement if they can be expressed... So IMO, we should be able to know from any type which type of comparison are supported: a) no comparison at all. b) equality comparison only (equal or not equal, no ordering) c) fully ordered comparisons. d) partially ordered comparisons. Knowing that, it is possible to write a map style container that would be optimized but works properly for any types that support comparison: a) Error at instanciation - we cannot compare the key b) Slow lineary search c) Fast binary search d) This might be a compromize between b and c as we would probably be able to uses fast search if defined and equality comparison otherwise...
About the performance thing: I think a fourth result value wouldn't add
any overhead at all.

And what do you do when that forth state is not acceptable (IMO most of the time)?

What do you mean? It is not a problem at all to implement the comparison operators with an undefined state (see my previous post).

If I code a container that require fully ordered comparison, I would not be able to detect thios a compile-time but only at run-time by an additional test... while currently it is generally assumes that we do not have an undefined state...
 One case where there would be overhead is in a case like that:

 if (a < b)
 {
 }
 else if (a ==b)
 {
 }
 else if (a > b)
 {
 }
 else
 {
 }

 If no undefined state is possible, the compiler can easily optimize the
 third test away and remove unreacheable code in the last else...

If you write code like that then obviously you think that you are dealing with partially ordered objects. Otherwise, what would you write in that last else-block?

The point is that with the addition of the undefined state, the compiler can do less validation... but in fact this is not really true since the compiler knows predefined types (like int) and can still warn in such case and for user type it might even be able to validate it in some case.. and if not (as it would be the case in C++ with the defiition is not visible), the compiler would not have known anyway... but might still warn if it assumes for warning purpose that operators do what they should and no side-effect occurs (but this would be more a check for a lint style tool)...
 IMO, undefined is a hole in the language as it does not allows to be
 able to ensure that something is totally ordered. The only solution
 that I see is to have 2 posibilities:

Why does total ordering have to be a requirement? It is a fact that not all kinds of objects can have a total ordering.

It is not a requirement to have total ordering... I only want to be able to know if a type pretent to support total ordering or not... IMO, this can be done by using a distinct return type for both cases (or maybe 3 or 4 cases if we consider the other two (a and b above)). In fact this would be a case where extensible enumeration would make sens (can we extent enum in D - I was thinking so but I do see it in doc so I think it might be a suggestion I have seen in this newgroup) or it might some predefined type that have some special rules... Also, for mixed type comparisons, generally, I do not want to allows them and/or returns undefined... This might also be decided from a base class how to handle such comparisons... class A; class B : A; A a; B b; a < b; // or a == b; Here I like to be able from class A to choose how mixed type comparison are handled. I was thinking on using final or virtual to decide but I realize that both concept are essentially independant. So IMO, we should either uses a specific return type to tell the compiler what to do or have some attribute modifiers for the function to tell about its properties or uses different functions names or have some class attribute for that (and the compiler would knows which conversions are valids... and would even be able to throw an exception if desired or not compile...). And I would also like if the compiler allows us to define < style operator and cmp style one (and generate the other as required) Also, in C++, one can do some type_trait for such things although it is cumbersome... so if the programmer and the compiler can give more informations on the properties of types and methods, then it is possible to make the compilation fails if we want so or select an appropriate algorithm by specialisation. Here since D does not support multiple inheritance, we cannot easily associate some trait to a class that way...
Sep 19 2003
prev sibling next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
We already have such a thing for floating point numbers.  And for floats
there are operators defined to test for unordered.

Maybe if cmp returns an enum: less = -1, equal = 0, greater = 1, unordered =
0x80000000

Sean

"Hauke Duden" <H.NS.Duden gmx.net> wrote in message
news:bkc6o9$2kpc$1 digitaldaemon.com...
 Daniel Yokomiso wrote:
 "cmp" expresses total ordering, which is a relationship between the


 operators and the comparison operator. Most people expect total


 IME it's the most common case.

Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators. Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement: If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value. Hauke

Sep 18 2003
parent Hauke Duden <H.NS.Duden gmx.net> writes:
Sean L. Palmer wrote:
 We already have such a thing for floating point numbers.  And for floats
 there are operators defined to test for unordered.

Really? Is this described anywhere? The docs on digitalmars.com seem to be unfinished when it comes to this: """ In addition to the usual < <= > >= == != comparison operators, D adds more that are specific to floating point. These are [blah, blah, blah] and match the semantics for the NCEG extensions to C. [insert table here] """ Doesn't contain all that much information ;). Hauke
Sep 18 2003
prev sibling parent reply "Daniel Yokomiso" <daniel_yokomiso yahoo.com.br> writes:
"Hauke Duden" <H.NS.Duden gmx.net> escreveu na mensagem
news:bkc6o9$2kpc$1 digitaldaemon.com...
 Daniel Yokomiso wrote:
 "cmp" expresses total ordering, which is a relationship between the


 operators and the comparison operator. Most people expect total


 IME it's the most common case.

Of course, if you're dealing with such Tuple objects you need to be aware that the ordering is only partial. That is an issue with understanding the program, not an issue with the language. Also, that doesn't change the fact that there is an ordering among the objects which cannot currently be expressed with the comparison operators. Another possible solution just occurred to me. It is not as clean as having different operator functions but maybe easier to implement: If cmp could simply return a value "undefined" if there is no ordering relation between the two objects, then everything would work. All comparison operators would simply have to return false if they encounter that value. Hauke

While I went to work I realized another reason why partial ordering shouldn't be modeled as a single trait. While a object can have at most one total ordering relation, therefore this trait can be modelled as a single method (or set of methods), any object can have multiple partial ordering relations. Using your int pair example, we have: 1. Pairwise (x1,y1) < (x2,y2) iff (x1 < x2) and (y1 < y2) (x1,y1) > (x2,y2) iff (x1 > x2) and (y1 > y2) 2. Sum (x1,y1) < (x2,y2) iff (x1 + y1) < (x2 + y2) (x1,y1) > (x2,y2) iff (x1 + y1) > (x2 + y2) 3. Norm (x1,y1) < (x2,y2) iff sqrt(x1 + y1) < sqrt(x2 + y2) (x1,y1) > (x2,y2) iff sqrt(x1 + y1) > sqrt(x2 + y2) and so on. IME we should never model something that isn't as an unique trait, in this case a single "cmp" method. A solution would use object views: a.pairWise < b.pairWise a.sum < b.sum a.norm < b.norm Best regards, Daniel Yokomiso. "1. Out of clutter, find simplicity. 2. From discord, find harmony. 3. In the middle of difficulty lies opportunity." - Albert Einstein --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.518 / Virus Database: 316 - Release Date: 11/9/2003
Sep 18 2003
parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Daniel Yokomiso wrote:
 While I went to work I realized another reason why partial ordering
 shouldn't be modeled as a single trait. While a object can have at most one
 total ordering relation, therefore this trait can be modelled as a single
 method (or set of methods), any object can have multiple partial ordering
 relations. Using your int pair example, we have:

Actually, objects can also have multiple total orderings. The best example for this are strings, which can be ordered by different criteria. Digits could be regarded as smaller or bigger than alpha characters, for example. This is also not a theoretical thing, by the way. Different countries DO use different default string orderings. It is up to the programmer to decide which possible definition for <,=,> he uses. That is true for partially ordered objects as well as for totally ordered ones. Hauke
Sep 19 2003
parent reply Daniel Yokomiso <Daniel_member pathlink.com> writes:
In article <bkek1a$4q3$1 digitaldaemon.com>, Hauke Duden says...
Daniel Yokomiso wrote:
 While I went to work I realized another reason why partial ordering
 shouldn't be modeled as a single trait. While a object can have at most one
 total ordering relation, therefore this trait can be modelled as a single
 method (or set of methods), any object can have multiple partial ordering
 relations. Using your int pair example, we have:

Actually, objects can also have multiple total orderings. The best example for this are strings, which can be ordered by different criteria. Digits could be regarded as smaller or bigger than alpha characters, for example. This is also not a theoretical thing, by the way. Different countries DO use different default string orderings.

I know this I live in a country with different default string ordering ;) But I disagree with your example. Strings are sequences of characters, and characters are encodings, so an ASCII string is different from a EBCDIC string, even if they both are printed as "abcde". So the ordering is correctly defined, sequences are ordered by element and by length, while the characters are ordered according to the regional rules. You are right about possible multiple total orderings (but I can't think of an example now), but that is a problem of every system that contains a subtype relationship. An example is the hierarchy of number types, usually defined as: complex <- real <- rational <- integer This "natural" relation isn't the only one possible. We can map the integer "2" to the rational "2", but we also can map the same integer to the rational "1/2". It's a possible morphism, as are others. If we want subtyping we want an implicit "natural" morphism, and we choose the one that is more convenient. Ditto for total orderings, while theoretically we can have multiple total-orderings we choose the most common one as the default. I don't think that we can decide which partial ordering relation is the most common. For example IMO tuples of ordered elements should be ordered by element (giving a total ordering), as sequences are, instead of using another relation. If you want different semantics use a different type (e.g. define a pair or a point type).
It is up to the programmer to decide which possible definition for <,=,> 
he uses. That is true for partially ordered objects as well as for 
totally ordered ones.

As this kind of definition is an implicit assumption, IMO we shouldn't change the expected "not(a <= b) implies (a > b)" property of the ordering operators. C# has a similar thing, you can overload true and false for your class, so it's possible to use a code where "(b != true) implies not(b)" doesn't hold. There's an quick explanation about it from the horse's mouth: <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>. IMO this is breaks lots of assumptions programmers have about how some piece of code should work, and the same goes for partial ordering operators.
Hauke

Best regards, Daniel Yokomiso. "Projects promoting programming in "natural languages" are intrinsically doomed to fail." - Edsger W. Dijkstra
Sep 19 2003
parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Daniel Yokomiso wrote:
Actually, objects can also have multiple total orderings. The best 
example for this are strings, which can be ordered by different 
criteria. Digits could be regarded as smaller or bigger than alpha 
characters, for example. This is also not a theoretical thing, by the 
way. Different countries DO use different default string orderings.

I know this I live in a country with different default string ordering ;) But I disagree with your example. Strings are sequences of characters, and characters are encodings, so an ASCII string is different from a EBCDIC string, even if they both are printed as "abcde". So the ordering is correctly defined, sequences are ordered by element and by length, while the characters are ordered according to the regional rules.

I didn't just mean different character sets, but also different opinions about which characters are smaller than others. Usually there are two ways to compare a string. One is to compare the character indices. This is often inadequate when the ordering becomes visible to the user (e.g. in graphical lists and stuff like that). The other is often called "collate" and it applies custom, locale-specific rules as to which characters come first. That is what I meant. Strings DO have different total orderings and they ARE used in practice. This is nothing inherent to partial orderings, so it cannot be used as an argument as to why partial order shouldn't be supported.
 You are right about possible multiple total orderings (but I can't think of an
 example now), but that is a problem of every system that contains a subtype
 relationship. An example is the hierarchy of number types, usually defined as:
 
 
 complex <- real <- rational <- integer
 
 
 This "natural" relation isn't the only one possible. We can map the integer "2"
 to the rational "2", but we also can map the same integer to the rational
"1/2".
 It's a possible morphism, as are others. If we want subtyping we want an
 implicit "natural" morphism, and we choose the one that is more convenient. 
 
 Ditto for total orderings, while theoretically we can have multiple
 total-orderings we choose the most common one as the default. I don't think
that
 we can decide which partial ordering relation is the most common.

Well, this really has nothing to do with the issue of wether partial ordering should be supported or not (as I have pointed out above), but nevertheless: The fact that an ordering is partial or not has no influence on wether there is a specific "canonical" ordering. As I tried to point out before, there are objects that also have different natural total orderings, depending on where you live. And apart from that, I also think that partially ordered objects often DO have a natural ordering as well. For example, the usual ordering for (x,y) tuples is (x1,y1) < (x2,y2) if x1<y1 && x2<y2. There are others (as has been pointed out before) but this one is common enough to be seen as a "default".
 As this kind of definition is an implicit assumption, IMO we shouldn't change
 the expected "not(a <= b) implies (a > b)" property of the ordering operators.
 C# has a similar thing, you can overload true and false for your class, so it's
 possible to use a code where "(b != true) implies not(b)" doesn't hold. There's
 an quick explanation about it from the horse's mouth:
 <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>. IMO this
 is breaks lots of assumptions programmers have about how some piece of code
 should work, and the same goes for partial ordering operators.

Hmmm. I think this is another good point as to why partial orderings should be supported ;). Maybe the problem is just that we have different basic assumptions about the comparison operators: I think if a < b yields true then it means "a is smaller than b". And more importantly, if a < b yields false it means "a is not smaller than b" and nothing else. For you, it seems that a<b == false also means "a is bigger or equal to b". I.e. you automatically apply rules that work for integers. Maybe it is just that I have more of a mathematical background, but for me that last assumption is not automatically true, unless someone has proven that is so. Or, in a programming context, unless the documentation says so or it is immediately obvious from the type of the object. Hauke
Sep 19 2003
parent "Philippe Mori" <philippe_mori hotmail.com> writes:
 And apart from that, I also think that partially ordered objects often
 DO have a natural ordering as well. For example, the usual ordering for
 (x,y) tuples is (x1,y1) < (x2,y2) if  x1<y1 && x2<y2. There are others
 (as has been pointed out before) but this one is common enough to be
 seen as a "default".

IMO, the default should be : < when x1 < x2 || x1 == x2 && y1 < y2 == when x1 == x2 && y1 == y2 > otherwise (i.e. when x1 > x2 || x1 == x2 && y1 > y2) This is easy to implement. Do the check for x. If the result is 0 (equal) compare y.
 As this kind of definition is an implicit assumption, IMO we shouldn't


 the expected "not(a <= b) implies (a > b)" property of the ordering


 C# has a similar thing, you can overload true and false for your class,


 possible to use a code where "(b != true) implies not(b)" doesn't hold.


 an quick explanation about it from the horse's mouth:
 <http://www.devhood.com/messages/message_view-2.aspx?thread_id=72626>.


 is breaks lots of assumptions programmers have about how some piece of


 should work, and the same goes for partial ordering operators.

Hmmm. I think this is another good point as to why partial orderings should be supported ;).

I think that operator should always have the expected meaning and the compiler should have the right to validate it if possible and issues a warning if the test is done incorrectly since the compiler can often validate the code and this will prevent some bugs in complex comparisons...
 Maybe the problem is just that we have different basic assumptions about
 the comparison operators:

 I think if a < b yields true then it means "a is smaller than b". And
 more importantly, if a < b yields false it means "a is not smaller than
 b" and nothing else.

 For you, it seems that a<b == false also means "a is bigger or equal to
 b". I.e. you automatically apply rules that work for integers.

 Maybe it is just that I have more of a mathematical background, but for
 me that last assumption is not automatically true, unless someone has
 proven that is so. Or, in a programming context, unless the
 documentation says so or it is immediately obvious from the type of the
 object.

But this is what it is expected from most programmer and I think that the programmer should be able to express is intent through code instead of documentation as this allows better diagnostic from the compiler, bugs prevention and code that is documentedby itself will probably have a documentation that is more correct than a comment inserted by a developper and not modified laster when the function changes... So I thing that we should be able to associate an attribute with a class and/or with comparing function to indicate what kind of ordering it offer... And we could support multiple ordering functions that would be recognized as such and for which we will knows the ordering property. For partial ordering, it would be required to add the modifier (otherwise, it would not be possible to returns the undefined state). Does we have a syntax for such attibutes in D? int total_ordering cmp(...); // Total ordering --cannot returns undefined int partial_ordering cmp(...); A class cannot have both type with the same name but might have many comparison functions: int total_ordering cmp_case_insentitive(...); int total_ordering cmp_c_locale(...); int total_ordering cmp_international_with_number(...); total_ordering would be the default for cmp function and other functions would not have such attribute by default. It would also be possible to query at compile-time those attributes such that we can make a sort function that accept only total_ordering functions and issues a compile time error (using static assert) if we try to instanciate with the wrong type (since we would have different specialisation (again, implicit template instanciation would be usefull - if not fully automatic at least by a keyword instanciate and automatic type detection).
Sep 19 2003
prev sibling parent John Boucher <John_member pathlink.com> writes:
In article <bkbvpd$2bld$1 digitaldaemon.com>, Hauke Duden says...
I recently thought about the way D handles operator overloading and I 
noticed that the current implementation will not work well in a 
mathematical context.

In math you often work with sets of objects that are only partially 
ordered. You may be able to determine that a < b or b > a, but if none 
of these two things apply it doesn't necessarily mean that a==b.

For example, consider the set of integer tuples.

The following definitions apply:
(x1,y1) < (x2,y2) <=> x1<x2 && y1<y2
(x1,y1) > (x2,y2) <=> x1>x2 && y1>y2

The problem now is that you cannot use D's operator overloading for such 
tuples because there is no way to implement cmp properly.

Example:

You have the following code:

Tuple t1(2,2);
Tuple t2(1,3);

if( t1 < t2 )
	doStuff();

The compiler translates the comparison to a call of t1.cmp(t2).

But how to implement cmp for integer tuples?
- t1 is not < t2 since 3>2
- t2 is not > t2 since 1<2
- but t1 is also != t2 !!

There is not return value defined for such cases.

I really think it is necessary to have separate functions for <, <=, >, 
= comparisons. That way also partially ordered objects could be 

without having to state that "t1 >= t2" or "t1==t2". Hauke

I'm pretty sure I don't understand what you're talking about, but then of course the compiler doesn't either. Only _you_ know what these values relate to, and I can have another interpetation. _You_ have to tell me (and the compiler) how these things relate. To _me_ I'd say that these values are points on the Cartesian Plane and I don't know which you intend to have be "lesser" or "greater" or if that even applies. In some instances I could consider that the point closer to the X axis is "lesser", while in others I could consider that the point closer to the Y axis is "lesser". It depends on what I want to do with the values. And my program would specify my intentions, I can't rely on the language to have a construct for each and every thing I want to do. The act of programming is to _apply_ the language to a problem. We should not expect (or want) a language to have specialized tools for everything. <aside> This also brings to mind a situation I was working on a few weeks ago dealing with strings and applying greaterthan and lesserthan operators thereto. It occurred to me that s1 >= s2 if s2 can be "subtracted" from s1. Certainly this is true of numbers, so why not strings? Given that D defines +, and Perl has both + (well, .) and * (well, x) string operators, I proceded to define -, /, and %. I then proceded to define the comparison operators around this concept. Would anyone else in this forum want such a concept in their language? I doubt it, so I can make a string class that does it, but it should _not_ be built into the language. However, I likewise am unsure that the comparison operators should have been defined for strings in D the way they are. With numbers, the comparison operators can be used for two purposes; determining order, and determining whether or nor subtraction will succeed (without going negative). But with strings you can't have one operator that does both, so there should be no operator to do either. It's both or neither for me. </aside> <soapbox> This gets back to the situation with the assignment operator; in most earlier languages the equal sign was used for both assignment _and_ test for equality and so the meaning was determined by context -- in an if statement (or other test)it's a test for equality, otherwise it's assignment. Now C came along and wanted to allow assignment in tests so they had to have two separate operators, what they _should_ have done was to define two _new_ operators, one for each meaning and done away with the bare equal sign as an operator. But they didn't, causing so many bugs and wasted time. And we're still debating the situation thirty years later! The bare equal sign should be both or neither. </soapbox> John Boucher The King had Humpty pushed.
Sep 18 2003