
digitalmars.D - Semantics of objects and '==' (and which operator for equality)

reply Bruno Medeiros <daiphoenix NOSPAMlycos.com> writes:
Hello. I would like to know the rationale behind the decision to 
make the '==' operator an equality operator for objects, and not an 
"identity" one, as in Java or C#. I did a little searching in the 
archives, but only found an unofficial comment about this, and from what 
I read there it seems the idea is to forget that objects are pointers 
and make them work more like basic types, like integers, thus using '==' 
to compare equality. Is that right?

You see, my initial impression is that this behaviour is very inconsistent:

There are two kinds of semantics for working with objects: 
pointer-like or basic-like (meaning non-pointer-like, like the 
basic/primitive types or C++-style references; I just came up with this 
name right now because I don't know another, possibly more correct, 
one). Anyway, in C++ you can have both kinds, but in most modern 
languages you have only one kind. In Java and C# you have pointer-like 
object semantics.

The problem with D is that objects behave like pointers in every 
regard, except when using '==' (or '!='), where they behave like 
basic types. The '==' is used to test the equality of the object 
instance itself, instead of its identity, or to view it another way, 
the equality of the object pointer.
So in effect you have a mix of the two, something which I've never seen 
in any language before and that seems to me like a very inconsistent and 
inelegant conceptual model for objects.
Most popular languages out there, like C++, Java, C# (and likely all 
C-syntax-based languages) do it pointer-like, and frankly I don't see 
any disadvantage (in terms of semantics and language/code 
expressiveness) in doing so. This has thus become a standard idiom for 
such languages.
And even if one wanted the objects to behave like basic non-pointer 
types, then you would have to make it behave consistently, that is, 
apply it everywhere in the language, right? Not just for '=='. 'obj1 = 
obj2' would then be a copy operation (like it is with integers, or C++ 
local objects), and not a pointer assignment.

So it seems to me that it is much better to use '==' for 
identity (equality of pointer) and maybe something like 'eq' and 'neq' 
for object equality.

-- 
Bruno Medeiros
Computer Science/Engineering student
Jul 03 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Bruno Medeiros" <daiphoenix NOSPAMlycos.com> wrote in message 
news:da9lq0$23qu$1 digitaldaemon.com...
 Hello. I would like to know the rationale behind the decision to 
 make the '==' operator an equality operator for objects, and not an 
 "identity" one, as in Java or C#. I did a little searching in the archives, 
 but only found an unofficial comment about this, and from what I read 
 there it seems the idea is to forget that objects are pointers and make 
 them work more like basic types, like integers, thus using '==' to compare 
 equality. Is that right?

 You see, my initial impression is that this behaviour is very 
 inconsistent:

 There are two kinds of semantics for working with objects: 
 pointer-like or basic-like (meaning non-pointer-like, like the 
 basic/primitive types or C++-style references; I just came up with this 
 name right now because I don't know another, possibly more correct, one). 
 Anyway, in C++ you can have both kinds, but in most modern languages 
 you have only one kind. In Java and C# you have pointer-like object 
 semantics.

 The problem with D is that objects behave like pointers in every regard, 
 except when using '==' (or '!='), where they behave like basic types. 
 The '==' is used to test the equality of the object instance itself, 
 instead of its identity, or to view it another way, the equality of 
 the object pointer.
 So in effect you have a mix of the two, something which I've never seen in 
 any language before and that seems to me like a very inconsistent and 
 inelegant conceptual model for objects.
 Most popular languages out there, like C++, Java, C# (and likely all 
 C-syntax-based languages) do it pointer-like, and frankly I don't see any 
 disadvantage (in terms of semantics and language/code expressiveness) in 
 doing so. This has thus become a standard idiom for such languages.
 And even if one wanted the objects to behave like basic non-pointer 
 types, then you would have to make it behave consistently, that is, apply 
 it everywhere in the language, right? Not just for '=='. 'obj1 = obj2' 
 would then be a copy operation (like it is with integers, or C++ local 
 objects), and not a pointer assignment.

 So it seems to me that it is much better to use '==' for identity (equality 
 of pointer) and maybe something like 'eq' and 'neq' for object equality.

Take a look at 'is' and '!is'.
Jul 03 2005
parent reply Anders F Björklund <afb algonet.se> writes:
Andrew Fedoniouk wrote:

 So it seems to me that it is much better to use '==' for identity (equality 
 of pointer) and maybe something like 'eq' and 'neq' for object equality.

 Take a look at 'is' and '!is'.

However, these are the *opposite* of the above pseudo-operators...

As others have disliked too, '==' is equality and '===' (sorry; "is") is identity in the D language. There is plenty of discussion/argumentation/fighting in the archives.

--anders
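
To make the distinction concrete, here is a minimal sketch in D. The class Point and its fields are invented for illustration, and the code assumes a current D compiler rather than the 2005 one under discussion.

```d
import std.stdio;

// Hypothetical example class; Point is not a type from the thread.
class Point
{
    int x, y;

    this(int x, int y) { this.x = x; this.y = y; }

    // Value equality: this is what '==' uses for two Point references.
    override bool opEquals(Object o)
    {
        auto other = cast(Point) o;
        return other !is null && x == other.x && y == other.y;
    }
}

void main()
{
    auto a = new Point(1, 2);
    auto b = new Point(1, 2);
    auto c = a;

    writeln(a == b);  // equality: compares contents through opEquals
    writeln(a is b);  // identity: two separate instances
    writeln(a is c);  // identity: the same instance under two names
}
```

So '==' answers "do these hold the same value?" while 'is' answers "is this the very same instance?", which is the reverse of what the original post proposed.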
Jul 03 2005
parent reply Bruno Medeiros <daiphoenix NOSPAMlycos.com> writes:
Anders F Björklund wrote:
 Andrew Fedoniouk wrote:
 
 So it seems to me that it is much better to use '==' for 
 identity (equality of pointer) and maybe something like 'eq' and 'neq' 
 for object equality.

 Take a look at 'is' and '!is'.

 However, these are the *opposite* of the above pseudo-operators... As others have disliked too, '==' is equality and '===' (sorry; "is") is identity in the D language. There is plenty of discussion/argumentation/fighting in the archives. --anders

If it is plenty, it's hard to find. I've spent some more time searching the archives, and I have read a couple of threads, namely these:

D - == object.cmp or same object ?
http://www.digitalmars.com/d/archives/16851.html
D - === / == / !== / != - a recipe for disaster!
http://www.digitalmars.com/d/archives/18188.html
D - null == o?
http://www.digitalmars.com/d/archives/12144.html
http://www.digitalmars.com/d/archives/digitalmars/D/22718.html
D - Identity & equivalence
http://www.digitalmars.com/d/archives/12141.html

However, most of the posts there were about closely related but different issues: whether "===" should be renamed to something else (this was before "is"), what the negative form of "is" should be, whether equality "==" should check for nulls, etc. Posts regarding the semantics of object references and '==' were very few, and none of them were an extensive argument, but rather short summaries/comments [1]. But this, from what I understood, is because the bulk of the discussion occurred a long time ago, "in the beginning of time" as someone said (love that quote :P ).

I did get an additional short point in favor of "==" as an equality op: that doing so would be consistent with <=, >=, etc. (the comparison ops), which also work on the object instance. I do find this point to be true, but still not enough to justify the D semantics. I think so because the inconsistency that exists between "==" and "=" is greater than what would otherwise exist between "==" and the comparison ops. Furthermore, the comparison ops don't even have meaning with this kind of pointer (the object references), in terms of working with the pointers themselves.

I would still like to hear some comments on this issue, recent ones preferably, as it would take a while to go back to the beginning of time :D . Namely, doesn't anyone else feel uncomfortable that "==" and "=" work on different levels?

[1] Well, actually I did find an extensive argument ( D/12295 ), but in favor of my own opinion ( it's pretty much the same as reason 2 ), not to the contrary.

-- 
Bruno Medeiros
Computer Science/Engineering student
Jul 07 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Fri, 08 Jul 2005 00:30:56 +0100, Bruno Medeiros  
<daiphoenix NOSPAMlycos.com> wrote:
 I did get an additional short point in favor of "==" as an equality op:  
 that doing so would be consistent with <=, >=, etc. (the  
 comparison ops), which also work on the object instance.

Clarification: They compare the 'values' of 'instances'. Instances referred to by references. Potentially the same instance in both cases, potentially the same reference in both cases.

There are several key points here:
1. references have a value and an identity.
2. instances have a value and an identity.

A reference's identity is the memory location where it is stored.
A reference's value is the instance to which it refers.
An instance's identity is the memory location where it is stored.
An instance's value is .. dependent on the contents of the instance.
 I do find this point to be true, but still not enough to justify the D  
 semantics. I think so because the inconsistency that exists between "=="  
 and "=" is greater than what would otherwise exist between "==" and the  
 comparison ops.

I can't think how we'd measure the inconsistency so this is a personal opinion, right? In my opinion there is no inconsistency between "==" and "=" for reference types (see below).
 Furthermore, the comparison ops don't even have meaning with this kind  
 of pointer (the object references), in terms of working with the  
 pointers themselves.

I'd agree, using <op> to compare reference identity is of very limited (no?) use. So if comparing reference identity is not really useful, why make <op> do it? (replace <op> with any comparison operator)
 I would still like to hear some comments on this issue, recent ones  
 preferably, as it would take a while to go back to the beginning of time  
 :D . Namely, doesn't anyone else feel uncomfortable that "==" and "="  
 work on different levels?

Not I. Traditionally we think of assigning references as assigning identity, e.g.

//a and b have separate identities and values.
char[] a = "one";
char[] b = "two";

//a is destroyed, a and b have the same identity (and value).
a = b;

Notice when you carry out the above you're assigning both the reference's identity and value. So, in effect "=" assigns both identity and value for a reference type. If "=" assigns value, then "==" comparing value is consistent.

Of course, it all depends on how you look at it.
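
The assignment behaviour described above can be sketched as follows. This is only an illustration under assumptions: the class Counter is invented, and the code targets a current D compiler, where string literals are immutable and therefore need .dup before being stored in a char[].

```d
import std.stdio;

// Hypothetical class used only for illustration.
class Counter
{
    int value;
    this(int value) { this.value = value; }
}

void main()
{
    auto a = new Counter(1);
    auto b = new Counter(2);

    a = b;                // copies the reference, not the instance
    writeln(a is b);      // both names now refer to one instance

    b.value = 42;
    writeln(a.value);     // the change is visible through 'a' as well

    // Arrays behave similarly: '=' shares the data, .dup copies it.
    char[] s = "one".dup; // .dup because literals are immutable in current D
    char[] t = s;         // t and s refer to the same characters
    char[] u = s.dup;     // u is an independent copy
    writeln(t is s);      // same data
    writeln(u == s);      // equal contents
    writeln(u is s);      // but different data
}
```

After the assignment the two references agree on both identity and value, which is the reading under which "=" and "==" are consistent.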
 [1] Well, actually I did find an extensive argument(  
 D/12295 ), but in favor of my  
 same opinion ( it's pretty much the same as reason 2 ), not to the  
 contrary.

Yes, note they also argued "Reason 1" and we've just agreed that identity comparison is not the most common operation, nor particularly useful, right?

Regan
Jul 07 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
I forgot one point. Theoretically there are 3 things <= could do for  
reference types:

A. compare reference identity. (memory address of reference)
B. compare referred instance identity. (memory address of instance)
C. compare referred instance value. (defined by instance - opCmp)

A and B seem, to me, to be the less likely cases. C seems the most likely  
case. Generally speaking, that is; it is possible certain areas of  
programming may use A and B more often, e.g. when writing a container  
class.
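
As a rough sketch of options B and C, assuming a current D compiler: the class Release and its fields are made up for this example and are not from the thread.

```d
import std.stdio;

// Hypothetical class; Release is not a type from the thread.
class Release
{
    int major, minor;
    this(int major, int minor) { this.major = major; this.minor = minor; }

    // Option C: order instances by their contents.
    override int opCmp(Object o)
    {
        auto other = cast(Release) o;
        assert(other !is null, "can only compare with another Release");
        if (major != other.major)
            return major < other.major ? -1 : 1;
        if (minor != other.minor)
            return minor < other.minor ? -1 : 1;
        return 0;
    }
}

void main()
{
    auto a = new Release(1, 2);
    auto b = new Release(1, 10);

    // Option C: '<=' on class references goes through opCmp.
    writeln(a <= b);

    // Option B: comparing instance addresses has to be spelled out,
    // and the result says nothing about the contents.
    writeln(cast(void*) a <= cast(void*) b);
}
```

The address comparison is still available when a container needs it, but it takes an explicit cast, so the plain operator stays with the common case.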

Regan
Jul 07 2005
next sibling parent AJG <AJG_member pathlink.com> writes:
Hi,

In article <opstktafmf23k2f5 nrage.netwin.co.nz>, Regan Heath says...
I forgot one point. Theoretically there are 3 things <= could do for  
reference types:

A. compare reference identity. (memory address of reference)
B. compare referred instance identity. (memory address of instance)
C. compare referred instance value. (defined by instance - opCmp)

A and B seem, to me, to be the less likely cases. C seems the most likely  
case. Generally speaking, that is; it is possible certain areas of  
programming may use A and B more often, e.g. when writing a container  
class.

I agree. Overall, option [C] seems more common and should be the default one. Same thing for equality (==) IMHO.

For those blue moons when A and B are useful, I'm sure something could be done with the & operator?

if (&ref1 <= &ref2)
// or
class Foo {
    int opCmp(Foo that) {
        return(cast(int) (&this - &that));
    }
}
// etc...

Whereas it's harder to get [C] to work if A or B is the default. No?

--AJG.
Jul 07 2005
prev sibling parent Bruno Medeiros <daiphoenix NOSPAMlycos.com> writes:
Regan Heath wrote:
 I forgot one point. Theoretically there are 3 things <= could do for  
 reference types:
 
 A. compare reference identity. (memory address of reference)
 B. compare referred instance identity. (memory address of instance)
 C. compare referred instance value. (defined by instance - opCmp)
 
 A and B seem, to me, to be the less likely cases. C seems the most 
 likely case. Generally speaking, that is; it is possible certain areas 
 of programming may use A and B more often, e.g. when writing a 
 container class.
 
 Regan

As for A, a "reference identity" is a pointer to a reference, which is not a reference; it's a normal pointer. Thus A is an independent and separate issue from how "<=" can be applied to references. The primitive comparison ops are applied to that pointer, regardless of what type it points to.

B is not unlikely or likely; it does not make sense. Pointer comparison only makes sense for pointers that point to elements of the same array, which never happens for an object reference.

So, like I said, there is only one thing that comparison ops can do to references (option C).

-- 
Bruno Medeiros
Computer Science/Engineering student
Jul 08 2005
prev sibling parent Anders F Björklund <afb algonet.se> writes:
Bruno Medeiros wrote:

 There is plenty of discussion/argumentation/fighting in the archives.

 If it is plenty, it's hard to find. I've spent some more time searching the archives, and I have read a couple of threads, namely these:

Most of the things in the D archives are hard to find, but the thread I was thinking about was the one that Kris started a while ago on the very topic of ==/===: http://www.digitalmars.com/d/archives/digitalmars/D/21060.html
 I did get an additional short point in favor of "==" as an equality op: 
 that doing so would be consistent with <=, >=, etc. (the 
 comparison ops), which also work on the object instance. I do find this 
 point to be true, but still not enough to justify the D semantics. I 
 think so because the inconsistency that exists between "==" and "=" is 
 greater than what would otherwise exist between "==" and the comparison 
 ops. Furthermore, the comparison ops don't even have meaning with this 
 kind of pointer (the object references), in terms of working with the 
 pointers themselves.

I think that analogy is rather natural, since '==' for the objects calls this.opEquals(that) whether it crashes or not, and since this.opCmp(that) is defined for *all* objects whether it has a natural meaning or not... (although that's another discussion, like you said, just "annoying")

So it seems they're just defined as what is easiest to implement ? (and not what is most consistent). Then again, I always liked the '==='.

And not everyone seems to think that using '==' was all that bad, e.g.:
http://cdsmith.twu.net/professional/java/pontifications/comparison.html
 I would still like to hear some comments on this issue, recent ones 
 preferably, as it would take a while to go back to the beginning of time 
 :D . Namely, doesn't anyone else feel uncomfortable that "==" and "=" 
 work on different levels?

I don't think that was the main reason for the last "fight", but rather that it worked differently in D than it does in Java (or with pointer types in C++) ? Or something, as I think I got the "most misleading post ever" award :-) (which means that I will not try any further summaries this time around)

To be honest, I'm starting to take the position of "whatever makes it ready for release sooner" - or for this language boat to stop rocking... So I think I'll stay out of it this time and check back in a month or so, seeing if anything really happened to the actual language. :-P

--anders
Jul 07 2005
prev sibling parent reply Anders F Björklund <afb algonet.se> writes:
Bruno Medeiros wrote:

 And even if one wanted the objects to behave like basic non-pointer
 types, then you would have to make it behave consistently, that is,
 apply it everywhere in the language, right? Not just for '=='. 'obj1 =
 obj2' would then be a copy operation (like it is with integers, or C++
 local objects), and not a pointer assignment.

 So it seems to me that it is much better to use '==' for 
 identity (equality of pointer) and maybe something like 'eq' and 'neq' 
 for object equality.

Another suggestion was to switch '==' and 'is' around, so that you could use '==' for identity and use 'is' to call upon the opEquals ? I always thought that to be rather confusing, especially at the time since it was still called '===' back then. Which has a meaning to me.

And if you are suggesting "this.opEquals(that)", I don't want it :-) I know that others do though, and it seems to be working out for Java.

Having something like 'eq' and 'ne', like Perl does, would be somewhat different but could work. It would make you search for all the rest of them, though ('lt' 'gt' 'le' 'ge')* and wonder what the difference is ? Like it being ok to use "a == b" for two char[], but having to use "a eq b" for two String ? Or "x != y" for two int, but "x ne y" for two Integer ? (not that wrapper classes are recommended, but...)

By the way, how *do* you "clone" an object in D ? I know that .dup works for the array types, but how to do a shallow copy for objects ?

It's funny, I thought this was "confusing" enough with '===' and '==' without having to drag '=' into it... But the more the merrier ? ;-)

--anders

PS. * I would still like to see the starship operator in D ('<=>'), pronounced 'cmp' and calls opCmp - it comes in handy for sorting.
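
The thread does not answer the cloning question. One possible approach, offered only as a sketch, is to write the copy by hand. The class Settings and its dup method below are hypothetical; the method is named after the array property, and the code assumes a current D compiler.

```d
import std.stdio;

// Hypothetical class; neither Settings nor its dup method is from the thread.
class Settings
{
    int width;
    int[] sizes;

    this(int width, int[] sizes)
    {
        this.width = width;
        this.sizes = sizes;
    }

    // Hand-written shallow copy: fields are copied one by one, so the
    // new instance shares the 'sizes' array data with the original.
    Settings dup()
    {
        return new Settings(width, sizes);
    }
}

void main()
{
    auto original = new Settings(80, [1, 2, 3]);
    auto copy = original.dup();

    writeln(copy is original);             // a distinct instance
    writeln(copy.width == original.width); // with the same field values
    writeln(copy.sizes is original.sizes); // the array data is still shared
}
```

A deep copy would duplicate the array as well, for example by passing sizes.dup to the constructor.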
Jul 08 2005
parent reply Bruno Medeiros <daiphoenix NOSPAMlycos.com> writes:
Anders F Björklund wrote:
 Bruno Medeiros wrote:
 
 And even if one wanted the objects to behave like basic non-pointer
 types, then you would have to make it behave consistently, that is,
 apply it everywhere in the language, right? Not just for '=='. 'obj1 =
 obj2' would then be a copy operation (like it is with integers, or C++
 local objects), and not a pointer assignment.

 So it seems to me that it is much better to use '==' for 
 identity (equality of pointer) and maybe something like 'eq' and 'neq' 
 for object equality.

 Another suggestion was to switch '==' and 'is' around, so that you could use '==' for identity and use 'is' to call upon the opEquals ? I always thought that to be rather confusing, especially at the time since it was still called '===' back then. Which has a meaning to me.

 And if you are suggesting "this.opEquals(that)", I don't want it :-) I know that others do though, and it seems to be working out for Java.

 Having something like 'eq' and 'ne', like Perl does, would be somewhat different but could work. It would make you search for all the rest of them, though ('lt' 'gt' 'le' 'ge')* and wonder what the difference is ? Like it being ok to use "a == b" for two char[], but having to use "a eq b" for two String ? Or "x != y" for two int, but "x ne y" for two Integer ? (not that wrapper classes are recommended, but...)

 By the way, how *do* you "clone" an object in D ? I know that .dup works for the array types, but how to do a shallow copy for objects ?

 It's funny, I thought this was "confusing" enough with '===' and '==' without having to drag '=' into it... But the more the merrier ? ;-)

 --anders

 PS. * I would still like to see the starship operator in D ('<=>'), pronounced 'cmp' and calls opCmp - it comes in handy for sorting.

Honestly, I have to say that for most of your post I didn't clearly understand your point. In any case let me just clarify some things:

If "==" were changed to be a reference equality op, then "is" shouldn't be the value equality op (i.e., no swapping). If we wanted an equality op in this case, it should be named "eq" (plus "neq" or "!eq") or something like that.

As for your other post, where you mentioned the thread that Kris started, well, I read a bit of it, but gave up soon. It was too long, had many unrelated posts (such as about isnot and other stuff) and the other ones were often misconstrued or had some inaccuracies, plus I would have had to read the whole thing flattened, instead of threaded.

Anyway, I'm going to postpone this issue for a while. I'm busy with my final exams, and when they're finished I'm planning to spend the summer holidays investigating and learning D more and doing some real work with it. I will try the current "==" design and see if I can accommodate myself to it, though I doubt that. And failing that, there is always some good ol':

sed -e "s/==/is/" -e "s/!=/!is/"   :)

-- 
Bruno Medeiros
Computer Science/Engineering student
Jul 08 2005
parent Anders F Björklund <afb algonet.se> writes:
Bruno Medeiros wrote:

 Honestly, I have to say that for most of your post I didn't clearly 
 understand your point.

That is OK, maybe I was adding too many different things into it...
 In any case let me just clarify some things: 
 If "==" were changed to be a reference equality op, then "is" shouldn't 
 be the value equality op (i.e., no swapping). If we wanted an equality op 
 in this case, it should be named "eq" (plus "neq" or "!eq") or something 
 like that.

Agreed, and I think the "swapping" idea was just because if you use identity on primitive types like int - it does fall back to equality ?

I'm certainly not proposing it, and I don't think that D needs any more compare operators like the ones proposed ("eq"/"neq") either...

BTW; I would prefer "ne" over "neq", but that's just because I'm an old user of Perl (and even Assembler) where that syntax is being used.
 As for your other post, where you mentioned the thread that Kris 
 started, well, I read a bit of it, but gave up soon. It was too long, 
 had many unrelated posts (such as about isnot and other stuff) and the 
 other ones were often misconstrued or had some inaccuracies, plus I 
 would have had to read the whole thing flattened, instead of threaded.

I'm not sure it was that much easier to follow back when it happened :-) It's a lot easier if you read it in Thunderbird, which does threading ?

To be honest, I'm not sure what came out of it - except for the "!is".
 Anyway, I'm going to postpone this issue for a while. I'm busy with my final 
 exams, and when they're finished I'm planning to spend the summer holidays 
 investigating and learning D more and doing some real work with it. I will try 
 the current "==" design and see if I can accommodate myself to it, 
 though I doubt that. And failing that, there is always some good ol': 
 sed -e "s/==/is/" -e "s/!=/!is/"   :)

That is what you need to do with most C++/Java code to "port" it, as well. Especially since the construct "object == null" is likely to segfault ?

But for the time being, I don't think that there'll be any more changes.

--anders
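
The null case can be sketched as below. As noted earlier in the thread, '==' on objects calls this.opEquals(that), so testing a null reference with '==' meant a call through null, whereas 'is' only compares the references. The class Node and the helper isMissing are invented for illustration, and the code assumes a current D compiler.

```d
import std.stdio;

// Hypothetical class and helper, used only for illustration.
class Node
{
    int value;
}

// Identity test against null: compares the reference itself and
// never calls a method on n, so it is safe for null references.
bool isMissing(Node n)
{
    return n is null;
}

void main()
{
    Node n;                  // class references default to null
    writeln(isMissing(n));   // nothing has been allocated yet

    n = new Node;
    writeln(isMissing(n));   // now refers to an instance
    writeln(n !is null);     // the negated form
}
```

When porting code, it is the comparisons against null that need 'is'; comparisons between two live objects can keep '==' if value equality is what was meant.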
Jul 09 2005