
digitalmars.D - "is null" vs "== null"

reply "Søren J. Løvborg" <web kwi.dk> writes:
I find it problematic that comparing null-values with the == operator 
crashes D programs.

I'm aware of the reason (namely, that a == b is translated to 
a.opEquals(b) ), but nevertheless, I'm afraid this may be a big newbie-trap, 
and that it will cause people to avoid D.

At the very least, the compiler should trap obviously wrong expressions like 
"obj == null" and "obj != null", which simply don't make sense with the 
current semantics of the == operator.
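
For illustration, a null test can be written with the identity operator, which compares references and never calls opEquals. This is only a minimal sketch; the Session class and the describe function are hypothetical names made up for the example.

```d
// Hypothetical class used only for this illustration.
class Session
{
    string user;
    this(string user) { this.user = user; }
}

// Returns a display name, treating a null reference as "no session".
string describe(Session s)
{
    // `is` compares the references themselves, so no member of `s`
    // is touched when `s` is null.
    if (s is null)
        return "no session";
    return s.user;
}

unittest
{
    assert(describe(null) == "no session");
    assert(describe(new Session("kwi")) == "kwi");
}
```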

But also for experienced D programmers, I believe the current == semantics 
to be dangerous, in that a potential segmentation fault lies hidden behind 
an innocuous == operator.

Since a null value often represents abnormal conditions, such a potential 
crash could easily go unnoticed, even with plenty of testing. How do you 
explain to a client that their server crashed with a segmentation fault 
because you accidentally used "==" instead of "is" on a value (that 
unfortunately never became null during testing)?

Such problems may be due to bad software engineering, but I believe that 
they are very real.

I'd much rather see the == operator extended to deal with null-values.
    if (a == b) { ... }
should be translated to
    if (a == null ? (b == null) : a.opEquals(b)) { ... }
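
The rewrite above can be approximated today as an ordinary helper function. The following is a rough sketch, not compiler behaviour: nullSafeEquals and the Point class are hypothetical names, and the null tests are spelled with "is" so that the helper itself never dereferences a null reference.

```d
// Hypothetical value class used only for this illustration.
class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }

    override bool opEquals(Object o)
    {
        auto p = cast(Point) o;   // null if o is null or not a Point
        return p !is null && x == p.x && y == p.y;
    }
}

// Sketch of the proposed translation of `a == b`:
//     a == null ? (b == null) : a.opEquals(b)
bool nullSafeEquals(Object a, Object b)
{
    return (a is null) ? (b is null) : a.opEquals(b);
}

unittest
{
    Point n = null;
    assert(nullSafeEquals(new Point(1, 2), new Point(1, 2)));
    assert(nullSafeEquals(n, n));                  // null equals null
    assert(!nullSafeEquals(n, new Point(1, 2)));   // and only null
}
```

Note that when only b is null, the result still depends on how a's opEquals treats a null argument.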

Pro: No hidden threat of segfaults/GPFs in a simple equality test. Plus, it 
makes sense that null equals null (and null only), even when comparing 
values ("==") rather than references ("is").

Con: This would add some overhead, though I believe that it would be 
negligible compared to the existing overhead of the opEquals call. For 
performance intensive operations, one should use the "is" operator, or call 
opEquals explicitly.

Thoughts?

Søren J. Løvborg
web kwi.dk 
Mar 25 2006
next sibling parent reply Anders F Björklund <afb algonet.se> writes:
Søren J. Løvborg wrote:

 I find it problematic that comparing null-values with the == operator 
 crashes D programs.

Yes, this is problematic and it has been up for discussion before...

Some old threads, for reference:
2005 http://www.digitalmars.com/d/archives/digitalmars/D/21225.html
2003 http://www.digitalmars.com/d/archives/12144.html

But I don't think that the language position on "null" has changed...

A segfault is viewed as an exception, since it throws one on Windows.
And comparing with null is not defined, thus throwing one is OK there.
(Taken freely from D/13854) It is here being compared to accessing
outside of the array, or similar.

 I'd much rather see the == operator extended to deal with null-values.

I've just learned to live with the segfaults, when programming in D. :-(
Fortunately they're pretty easy to track down in the debugger, if found.

--anders
Mar 25 2006
parent reply "Søren J. Løvborg" <web kwi.dk> writes:
Anders F Björklund wrote:
 I find it problematic that comparing null-values with the == operator 
 crashes D programs.

 Yes, this is problematic and it has been up for discussion before...
 Some old threads, for reference:
 2005 http://www.digitalmars.com/d/archives/digitalmars/D/21225.html
 2003 http://www.digitalmars.com/d/archives/12144.html
 But I don't think that the language position on "null" has changed...

Walter wrote (on Fri, 13 Jun 2003):
 So, think of (o==p) with objects as "Compare the contents of object o with
 object p. If either is null, there are no contents, and that is outside the
 scope of the == operator's purpose. Therefore, an exception is thrown just
 as if an array bounds were exceeded. If the object references are not
 supposed to be null, then it's a program bug. If the object references can
 be null, then explicitly code what a null reference means."

So == doesn't handle nulls, since nulls are defined to be outside the scope 
of its purpose. The question is whether == would be more useful to 
programmers if its scope were extended to include null values.

I'll venture to say that most people will expect that null == null, and at 
the very least, it seems that many (most?) people on the NG agree.

Since comparing against null is illegal in the current specification, the 
change could be made without affecting existing programs.

Besides Walter, who's already stated his reasons, anyone else who's against 
such a change?

 A segfault is viewed as an exception, since it throws one on Windows.

Søren J. Løvborg web kwi.dk
Mar 26 2006
parent reply "Lionello Lunesu" <lio remove.lunesu.com> writes:
 I'll venture to say that most people will expect that null == null, and at 
 the very least, it seems that many (most?) people on the NG agree.

What about the general case: ptr == ptr (which includes the case where 
ptr==null: null == null)? Should the compiler inhibit the call to opEquals 
in the general case as well?

But settling this issue will not settle the case where only one of the 
arguments to "==" is null.

I saw some opEquals implementation somewhere that started with the line 
"if (o is null) return false;", which seems silly, since if opEquals had 
been called the other way around (b == a, instead of a == b) then the 
program would surely have crashed (being a virtual call, "this" must be 
valid to access the virtual function table). Right? "a==b" should in any 
case be identical to "b==a".

I guess the only thing we can agree on is that the compiler should emit an 
error message when it encounters code like "if (whatever == null)" or 
"if (null == whatever)".

L.
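
One way to get the symmetry asked for here is to do the null handling in a free function, before any virtual call is made. The sketch below is only an illustration of that idea: symmetricEquals and the Name class are hypothetical names, and the identity shortcut covers the "ptr == ptr" case, null == null included.

```d
// Hypothetical class whose opEquals guards against a null argument,
// like the implementation described above.
class Name
{
    string value;
    this(string value) { this.value = value; }

    override bool opEquals(Object o)
    {
        auto other = cast(Name) o;   // null if o is null or not a Name
        return other !is null && value == other.value;
    }
}

// Gives the same answer for (a, b) and (b, a), whichever side is null.
bool symmetricEquals(Object a, Object b)
{
    if (a is b)
        return true;                  // same reference, or both null
    if (a is null || b is null)
        return false;                 // exactly one side is null
    // Both references are valid here, so both virtual calls are safe.
    return a.opEquals(b) && b.opEquals(a);
}

unittest
{
    auto a = new Name("x");
    assert(symmetricEquals(a, new Name("x")));
    assert(!symmetricEquals(a, null));
    assert(!symmetricEquals(null, a));   // no crash in either order
}
```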
Mar 27 2006
parent Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:
Lionello Lunesu wrote:
 I'll venture to say that most people will expect that null == null, and at 
 the very least, it seems that many (most?) people on the NG agree.

What about the general case: ptr == ptr (which includes the case where ptr==null: null == null)? Should the compiler inhibit the call to opEquals in the general case as well? But settling this issue will not settle case where one of the arguments to "==" is null. I saw some opEquals implementation somewhere that started with the line "if (o is null) return false;", which seems silly, since if the opEquals would have been called the other way around (b == a, instead of a == b) then the program would have surely crashed (being a virtual call, "this" must be valid to access the virtual function table). Right? "a==b" should in any case be identical to "b==a".

That's right. Equality is a symmetric operation in mathematics. OTOH, 
floating point values work correctly in D and D even has complex numbers as 
a type. Thus as a "mathematical" :) language D should IMO definitely keep 
the currently working equivalence semantics.

-- Jari-Matti
Apr 03 2006
prev sibling parent reply "Unknown W. Brackets" <unknown simplemachines.org> writes:
I have a silly question.

When is comparing against == with opEquals *not* an error?  I mean, when 
should that ever pass if opEquals is available to be called?

I can see a class that says, "I'm null currently" but that seems about 
as logical as a class that overloads operators to do bizarre things.

-[Unknown]
Mar 25 2006
parent "Søren J. Løvborg" <web kwi.dk> writes:
Unknown W. Brackets wrote:
 When is comparing against == with opEquals *not* an error?  I mean, when 
 should that ever pass if opEquals is available to be called?
 I can see a class that says, "I'm null currently" but that seems about as 
 logical as a class that overloads operators to do bizarre things.

Assuming you meant "comparing against _null_ with opEquals", I agree.

This is why I believe null should be defined as being equal to null (and 
only null), both when comparing references and when comparing values.

Søren J. Løvborg
web kwi.dk
Mar 26 2006