digitalmars.D - C# interview

bearophile <bearophileHUGS lycos.com> writes:
Interview with Anders Hejlsberg, one of the authors of C#:
http://www.computerworld.com.au/index.php/id;1149786074;fp;;fpid;;pf;1

This is one of the questions:
"Would you do anything differently in developing C# if you had the chance?"

This part of the answer shows a future feature of D (note that Delight already has this):
50% of the bugs that people run into today, coding with C# in our platform, and
the same is true of Java for that matter, are probably null reference
exceptions. If we had had a stronger type system that would allow you to say
that 'this parameter may never be null, and you compiler please check that at
every call, by doing static analysis of the code'. Then we could have stamped
out classes of bugs.

Another question near the end:
"Where do you think programming languages will be heading in the future,
particularly in the next 5 to 20 years?"

My answer: OMeta is the future. Bye, bearophile
Oct 01 2008
Anders F Björklund <afb algonet.se> writes:
bearophile wrote:

 This part of the answer shows a future feature of D (note that Delight already
has this):
 
 50% of the bugs that people run into today, coding with C# in our
 platform, and the same is true of Java for that matter, are
 probably null reference exceptions. If we had had a stronger type
 system that would allow you to say that 'this parameter may never
 be null, and you compiler please check that at every call, by doing
 static analysis of the code'. Then we could have stamped out
 classes of bugs.


The Delight feature was inspired by Nice, http://nice.sourceforge.net/

"""One documents whether a type can hold the null value by simply prefixing it with the ? symbol. Thus: ?String name; is the declaration of String variable that might be null. To get its length, one must write int len = (name == null) ? 0 : name.length();. Calling name.length() directly is a type error. It would only be possible if name was defined as String name = "Some non-null string";."""

Other parts are now in main Java, documentation talking about JDK 1.3...
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5030232
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6207924

--anders
Oct 02 2008
Ary Borenszweig <ary esperanto.org.ar> writes:
bearophile wrote:
 Interview to Anders Hejlsberg, one C# author:
 http://www.computerworld.com.au/index.php/id;1149786074;fp;;fpid;;pf;1
 
 This is one of the questions:
 Would you do anything differently in developing C# if you had the chance?<

This part of the answer shows a future feature of D (note that Delight already has this):
 50% of the bugs that people run into today, coding with C# in our platform,
and the same is true of Java for that matter, are probably null reference
exceptions. If we had had a stronger type system that would allow you to say
that 'this parameter may never be null, and you compiler please check that at
every call, by doing static analysis of the code'. Then we could have stamped
out classes of bugs.


What's that feature in D?

If null references are the most problematic (and I must agree because I started a project in D and I got null references a few times already, but no other "bug"), why is it that in debug mode asserts are not put for every reference access, to show "somefile.d(58): Error: Null Reference" instead of just "Error: Access Violation"? :-(

No, instead you have to open a debugger to discover the file and line number... What's the point of losing that time?
Oct 02 2008
"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 2, 2008 at 7:25 AM, Ary Borenszweig <ary esperanto.org.ar> wrote:

 If null references are the most problematic (and I must agree because I
 started a project in D and I got null references a few times already, but no
 other "bug"), why is it that in debug mode asserts are not put for every
 reference access, to show "somefile.d(58): Error: Null Reference" instead of
 just "Error: Access Violation"? :-(

 No, instead you have to open a debugger to discover the file and line
 number... What's the point of losing that time?

Because -- according to Walter -- access violations are much better than how they had it in the old days before memory protection (back when programmers had to walk 20 miles uphill both ways in the snow to work), so you'd better be satisfied with them. I think it's an awful argument.
Oct 02 2008
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Jarrett Billingsley wrote:
 On Thu, Oct 2, 2008 at 7:25 AM, Ary Borenszweig <ary esperanto.org.ar> wrote:
 No, instead you have to open a debugger to discover the file and line
 number... What's the point of losing that time?

Because -- according to Walter -- access violations are much better than how they had it in the old days before memory protection (back when programmers had to walk 20 miles uphill both ways in the snow to work), so you'd better be satisfied with them. I think it's an awful argument.

Not if you consider what I blabbered about at the TConf - stack tracing integrated with the runtime (and DDL!) :) It would skip the runtime cost while shifting it into executable size (cause you'd need -g) and you could still disable it in release code.

--
Tomasz Stachowiak
http://h3.team0xf.com/
h3/h3r3tic on #D freenode
Oct 02 2008
"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Thu, Oct 2, 2008 at 9:43 AM, Tom S <h3r3tic remove.mat.uni.torun.pl> wrote:
 Jarrett Billingsley wrote:
 On Thu, Oct 2, 2008 at 7:25 AM, Ary Borenszweig <ary esperanto.org.ar>
 wrote:
 No, instead you have to open a debugger to discover the file and line
 number... What's the point of losing that time?

Because -- according to Walter -- access violations are much better than how they had it in the old days before memory protection (back when programmers had to walk 20 miles uphill both ways in the snow to work), so you'd better be satisfied with them. I think it's an awful argument.

Not if you consider what I blabbered about at the TConf - stack tracing integrated with the runtime (and DDL!) :) It would skip the runtime cost while shifting it into executable size (cause you'd need -g) and you could still disable it in release code.

That's true.
Oct 02 2008
Chad J <gamerchad __spam.is.bad__gmail.com> writes:
Ary Borenszweig wrote:
 
 What's that feature in D?
 
 If null references are the most problematic (and I must agree because I 
 started a project in D and I got null references a few times already, 
 but no other "bug"), why is it that in debug mode asserts are not put 
 for every reference access, to show "somefile.d(58): Error: Null 
 Reference" instead of just "Error: Access Violation"? :-(
 
 No, instead you have to open a debugger to discover the file and line 
 number... What's the point of losing that time?

I agree. That's always irritated me.
Oct 02 2008
Christopher Wright <dhasenan gmail.com> writes:
Ary Borenszweig wrote:
 What's that feature in D?
 
 If null references are the most problematic (and I must agree because I 
 started a project in D and I got null references a few times already, 
 but no other "bug"), why is it that in debug mode asserts are not put 
 for every reference access, to show "somefile.d(58): Error: Null 
 Reference" instead of just "Error: Access Violation"? :-(
 
 No, instead you have to open a debugger to discover the file and line 
 number... What's the point of losing that time?

You get a stacktrace if you use jive or the Windows equivalent.

Back to the original topic, almost all of the time, you don't encounter this bug. So it doesn't make any sense to introduce this to the type system, at least not with the default being the method requiring additional logic.

Groovy has a ?. operator:

int i = foo?.bar;
=>
int i;
if (foo is null) i = default(typeof(foo.bar));
else i = foo.bar;

That operator costs one additional character and pretty much no thought.
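
For illustration only, the Groovy-style expansion above can be approximated in D with a small library helper rather than an operator. This is a minimal sketch: Foo, bar and orDefault are made-up names, and D itself provides no ?. operator.

```d
// Hypothetical class used only for the example.
class Foo
{
    int bar = 7;
}

// Hypothetical helper approximating Groovy's `foo?.bar`:
// applies `getter` to the reference, or yields the result type's .init
// when the reference is null.
auto orDefault(alias getter, T)(T obj)
if (is(T == class))
{
    alias R = typeof(getter(obj));
    return (obj is null) ? R.init : getter(obj);
}
```

A call would look like orDefault!(f => f.bar)(foo). The null check is still there, it is just written once inside the helper instead of at every use site.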
Oct 02 2008
"Denis Koroskin" <2korden gmail.com> writes:
On Thu, 02 Oct 2008 05:03:40 +0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Interview to Anders Hejlsberg, one C# author:
 http://www.computerworld.com.au/index.php/id;1149786074;fp;;fpid;;pf;1

 This is one of the questions:
 Would you do anything differently in developing C# if you had the  
 chance?<

This part of the answer shows a future feature of D (note that Delight already has this):
 50% of the bugs that people run into today, coding with C# in our  
 platform, and the same is true of Java for that matter, are probably  
 null reference exceptions. If we had had a stronger type system that  
 would allow you to say that 'this parameter may never be null, and you  
 compiler please check that at every call, by doing static analysis of  
 the code'. Then we could have stamped out classes of bugs.


It is a *very* good feature to have! I understand this as follows:

No class reference can have a null value, i.e. you can't write like this:

Object o;
o = new Object();

because o will store a null reference for a short time. This is not valid, too:

Object o = new Object();
o = null;

This, however, is ok:

Object? o = null;
Object o2 = o; // ok, but throws an exception if o is null

void foo(Object? bar);
foo(null); // ok

void bar(Object baz);
bar(null); // compile-time error

Object o = ...;
bar(o); // always fine (because o can't be null)

However, T? needs language support. But once opImplicitCast is implemented it won't need language support anymore:

struct Nullable(T) // called Likely in "The Power of None" by Andrei Alexandrescu (http://www.nwcpp.org/Downloads/2006/The_Power_of_None.ppt)
{
    private void* ptr;

    this(T ptr) {
        this.ptr = cast(void*)ptr;
    }

    T opImplicitCast() {
        if (ptr is null) throw new NullReferenceException();
        return cast(T)ptr;
    }
}

T? is still nicer than Nullable!(T) to type. T?, however, might be syntactic sugar for the Nullable template.

It can be further extended so that no T type would have a T.init property:

int i1; // can't leave uninitialized
int? i2; // ok, initialized to null

int t1 = i1; // safe
int t2 = i2; // throws if i2 is null

int? find(T needle); // returns null if object is not found. Compare to returning 0 or -1 or InvalidIndex or std::string::npos etc.
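
As a point of comparison, the "find returns null instead of -1" idea can be sketched with the Nullable template from D's standard library (std.typecons). The function name findIndex and its signature are made up for this example; the T? syntax discussed in this post is a proposal, not something the sketch relies on.

```d
import std.typecons : Nullable;

// Hypothetical lookup: returns an empty Nullable when the needle is absent,
// instead of a sentinel such as -1 or npos.
Nullable!size_t findIndex(const int[] haystack, int needle)
{
    foreach (i, value; haystack)
    {
        if (value == needle)
            return Nullable!size_t(i);
    }
    return Nullable!size_t.init; // isNull is true for the default state
}
```

The caller has to go through isNull or get to reach the index, so "not found" cannot be mistaken for a valid position.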
Oct 02 2008
Don <nospam nospam.com.au> writes:
Denis Koroskin wrote:
 The two things that need to be changed to support this feature are:
 
 1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
 
 The latter rule can be relaxed (as done in C#): you can have a variable 
 uninitialized. However, you can't read from it until you initialize it 
 explicitly. This is enforced statically:
 
 // The following is ok:
 Object o;
 o = new Object();
 
 // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
 
 // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
 
 Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references?
So none of the examples would compile, unless you wrote:

Object o = new Object();

or

Object o = null;

The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them.
I can't see any advantage from omitting the "=null;"
Oct 06 2008
Michel Fortin <michel.fortin michelf.com> writes:
On 2008-10-06 06:13:38 -0400, "Denis Koroskin" <2korden gmail.com> said:

 string foo(Bar o)
 {
      // should we check for null or not?
      return o.toString();
 }
 
 Bar getBar()
 {
      if (...) {
          return new Bar();
      } else {
          // should we throw an exception or return null?
          return null;
      }
 }
 
 // should we check for null here?
 string str = foo( getBar() );
 
 Too many choices. The better code would be as follows:
 
 string foo(Bar o)
 {
      // no checking needed here, o can't be null!
      return o.toString();
 }
 
 Bar? getBar()
 {
      if (...) {
          return new Bar();
      } else {
          // ok, we can return null and user *knows* about that!
          // this is very important
          return null;
      }
 }
 
 string str = foo( getBar() ); // might throw a NullReference exception  
 (just what we need!)
 
 alternatively you can make checking yourself:
 
 Bar? bar = getBar();
 string str = (bar is null) ? "undefined" : foo(bar); // ok, it's safe
 
 Please, comment!

I like it. And I think it should extend to any pointer. But would this:

char* a; // non-null pointer

become this:

char*? b; // "?" makes the pointer nullable

or this:

char? b; // "?" means nullable pointer

? "char?" doesn't seem to work since then for consistency "Object?" should be a nullable pointer to an Object reference. But then, "char*?" isn't very appealing.

Oh, and does that mean we now have "const(Object)?" to create non-const pointers to const objects? That'd be nice. :-)

Perhaps "Object" should just be a shortcut for "Object*", which would mean that you could write "const(Object)*" if you wanted, and "char?" could work as a nullable pointer. But then, "==" wouldn't work right for objects (because it'd have to compare pointers), am I right?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Oct 06 2008
Michel Fortin <michel.fortin michelf.com> writes:
On 2008-10-06 09:11:41 -0400, "Denis Koroskin" <2korden gmail.com> said:

 char? <-> Nullable (char)
 char*? <-> Nullable (char*)

Ok, I think I see your point. We only have different interpretations. Mine was that only pointers (and object references) could be made null; therefore, only pointers and object references could be nullable, but wouldn't be by default.

Your idea is that any type can be made nullable. For pointers and object references, this would be represented by null pointers; for other types it'd have to be some kind of struct containing that type and a boolean null-indicator value. Is this correct?

Although I don't dislike this idea, I think nullable value-types offer much less value than nullable pointers and references. For one, making value-types nullable isn't a solution for the problem at hand -- null pointer errors (access violation, segmentation faults, etc.) -- because value-types already don't have this problem. For two, it makes the type's memory layout bigger instead of simply allowing it to hold the special zero value.

That said, I'm not against nullable value-types. My opinion is that perhaps the language could be kept simpler by only allowing pointer and object references to be nullable, because that's where it matters the most.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
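
To make the layout argument concrete, here is a rough D sketch of the "struct containing that type and a boolean null-indicator" reading. NullableValue is a made-up name for illustration; it is not the proposed T? and not a standard library type.

```d
// Hypothetical layout for a nullable value type: the payload plus a flag.
struct NullableValue(T)
{
    private T payload;
    private bool hasValue = false;

    this(T value)
    {
        payload = value;
        hasValue = true;
    }

    bool isNull() const { return !hasValue; }

    // Reading an empty value is treated as a programming error.
    T get()
    {
        assert(hasValue, "value is null");
        return payload;
    }
}

// The flag makes the nullable int larger than a plain int ...
static assert(NullableValue!int.sizeof > int.sizeof);
// ... while a class reference stays pointer-sized and can use the null
// pointer itself as the "no value" state.
static assert(Object.sizeof == (void*).sizeof);
```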
Oct 06 2008
Don <nospam nospam.com.au> writes:
Denis Koroskin wrote:
 On Tue, 07 Oct 2008 05:55:30 +0400, Michel Fortin 
 That said, I'm not against nullable value-types. My opinion is that 
 perhaps the language could be kept simpler by only allowing pointer 
 and object references to be nullable, because that's where it matters 
 the most.

Yes, you rarely need use of them, and in most cases they are local variables created on stack so the cost is minimal. That said, they are not crucial for the language, but it is better to have them for consistency with reference types.

References are NOT values. Don't pretend they are the same. "A foolish consistency is the hobgoblin of little minds."
Oct 07 2008
Don <nospam nospam.com.au> writes:
Denis Koroskin wrote:
 On Mon, 06 Oct 2008 13:58:39 +0400, Don <nospam nospam.com.au> wrote:
 
 Denis Koroskin wrote:
 The two things that need to be changed to support this feature are:
  1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
  The latter rule can be relaxed (as done in C#): you can have a 
 variable uninitialized. However, you can't read from it until you 
 initialize it explicitly. This is enforced statically:
  // The following is ok:
 Object o;
 o = new Object();
  // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
  // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
  Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references? So none of the examples would compile, unless you wrote: Object o = new Object(); or Object o = null; The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them. I can't see any advantage from omitting the "=null;"

That's the whole point - you won't need to check for (o is null) ever. All the references are always valid. This might be a good contract to enforce.

I think you misunderstood me. I wrote "=null" not "==null".
It's my experience that most null pointer bugs in D are caused by simply forgetting to initialize the reference. Anyone coming from C++ is extremely likely to do this. This bug could be completely eliminated by requiring objects to be initialized. If you want them to be uninitialised, fine, set them equal to null. Make your intentions clear.

Then, an object reference could only ever be null if it was explicitly set to null.

There are some null pointer exceptions which are caused by returning a null by accident, but they're much rarer, and they're not much different to any other logic error.
Yes, they would be eliminated in your proposal. But the cost-benefit is not nearly as good as eliminating the default null initialisation; the benefit is higher, but the cost is hundreds of times higher.
Oct 07 2008
Don <nospam nospam.com.au> writes:
Denis Koroskin wrote:
 On Tue, 07 Oct 2008 13:33:16 +0400, Don <nospam nospam.com.au> wrote:
 
 Denis Koroskin wrote:
 On Mon, 06 Oct 2008 13:58:39 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 The two things that need to be changed to support this feature are:
  1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
  The latter rule can be relaxed (as done in C#): you can have a 
 variable uninitialized. However, you can't read from it until you 
 initialize it explicitly. This is enforced statically:
  // The following is ok:
 Object o;
 o = new Object();
  // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
  // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
  Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references? So none of the examples would compile, unless you wrote: Object o = new Object(); or Object o = null; The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them. I can't see any advantage from omitting the "=null;"

That's the whole point - you won't need to check for (o is null) ever. All the references are always valid. This might be a good contract to enforce.

I think you misunderstood me. I wrote "=null" not "==null". It's my experience that most null pointer bugs in D are caused by simply forgetting to initialize the reference. Anyone coming from C++ is extremely likely to do this. This bug could be completely eliminated by requiring objects to be initialized. If you want them to be uninitialised, fine, set them equal to null. Make your intentions clear.

Well, there is no difference between Object o; and Object o = null; - they both will be initialized to null. But I agree that explicit initialization makes an intention more clear.
 Then, an object reference could only ever be null if it was explicitly 
 set to null.

Not, actually:

void foo(Bar b) {
    Bar bb = b; // bb is null == b is null
}

But b must have been explicitly set to null. Someone has made a conscious decision that some reference is permitted to be null. Right now, it can happen by accident.
 There are some null pointer exceptions which are caused by returning a 
 null by accident, but they're much rarer, and they're not much 
 different to any other logic error.

function always returns valid reference is a good contract. User will be aware of the potential null returned.
 Yes, they would be eliminated in your proposal. But the cost-benefit 
 is not nearly as good as eliminating the default null initialisation; 
 the benefit is higher, but the cost is hundreds of times higher.


I wasn't talking about runtime cost, but cost to the language (increase in complexity of the spec, Walter's time). It requires _major_ language changes. New lexing, new types, new name mangling, new code generation.

 Actually, I thought that my proposal will lead to
 better optimizations (while ensuring high robustness) to what we have 
 now. For example, now every function that accepts a class reference 
 ought to check the pointer against null. With my proposal you should 
 check just once! upon casting from T? to T. Once you got a reference, it 
 is not null so no checking against null is necessary. And since you 
 operate on T instead of T? all the time (and assigning from T? to T? 
 doesn't need any checking either) it is rather enough to ignore the cost.

True. I don't disagree that non-nullable references would be nice to have. I just think that it's unlikely that Walter would do it.
My point is simply that:

Object o;

is almost always an error, and I think that making it illegal would catch the #1 trivial bug in D code.
Oct 07 2008
Benji Smith <dlanguage benjismith.net> writes:
Adam D. Ruppe wrote:
 On Tue, Oct 07, 2008 at 02:21:23PM +0200, Don wrote:
 My point is simply that:

 Object o;

 is almost always an error, and I think that making it illegal would 
 catch the #1 trivial bug in D code.

I agree here. When I first started with D, I would write that almost every time I was working with an object, and it would segfault without fail. In C++, writing Object o; is quite common, and as a C++ user coming to D, it took me a while to get away from that habit. I'm sure other C++ users have had the same experience. This change definitely seems like a good idea. If it breaks existing code, it is easy enough to add the = null if you need it, so there is little cost and I think a decent gain.

On the flip side, as a Java programmer, I write "Object o;" all the time, and it's generally NOT an error. For example:

class MyObject {
    Object a;
    Object b;
    Object c;
    public this(Object a, Object b, Object c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
}

I like to leave object references uninitialized in my class member declarations, and then assign them in the constructor. I'd be disappointed if those kinds of declarations were illegal, and I was required to initialize them with "= null" every time.

--benji
Oct 07 2008
"Denis Koroskin" <2korden gmail.com> writes:
The two things that need to be changed to support this feature are:

1) make typeof(null) == void*
2) remove default initializers (for reference types, at least)

The latter rule can be relaxed (as done in C#): you can have a variable  
uninitialized. However, you can't read from it until you initialize it  
explicitly. This is enforced statically:

// The following is ok:
Object o;
o = new Object();

// This one is ok, too:
Object o;
if (condition) {
    o = new Foo();
} else {
    o = new Bar();
}

// But this is rejected:
Object o;
if (condition) {
    o = new Foo();
}

Object o2 = o; // use of (possibly) uninitialized variable
Oct 02 2008
"Denis Koroskin" <2korden gmail.com> writes:
On Mon, 06 Oct 2008 13:58:39 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 The two things that need to be changed to support this feature are:
  1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
  The latter rule can be relaxed (as done in C#): you can have a  
 variable uninitialized. However, you can't read from it until you  
 initialize it explicitly. This is enforced statically:
  // The following is ok:
 Object o;
 o = new Object();
  // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
  // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
  Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references? So none of the examples would compile, unless you wrote: Object o = new Object(); or Object o = null; The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them. I can't see any advantage from omitting the "=null;"

That's the whole point - you won't need to check for (o is null) ever. All the references are always valid. This might be a good contract to enforce.

string foo(Bar o)
{
     // should we check for null or not?
     return o.toString();
}

Bar getBar()
{
     if (...) {
         return new Bar();
     } else {
         // should we throw an exception or return null?
         return null;
     }
}

// should we check for null here?
string str = foo( getBar() );

Too many choices. The better code would be as follows:

string foo(Bar o)
{
     // no checking needed here, o can't be null!
     return o.toString();
}

Bar? getBar()
{
     if (...) {
         return new Bar();
     } else {
         // ok, we can return null and user *knows* about that!
         // this is very important
         return null;
     }
}

string str = foo( getBar() ); // might throw a NullReference exception (just what we need!)

Alternatively, you can do the check yourself:

Bar? bar = getBar();
string str = (bar is null) ? "undefined" : foo(bar); // ok, it's safe

Please, comment!
Oct 06 2008
"Denis Koroskin" <2korden gmail.com> writes:
On Mon, 06 Oct 2008 15:34:02 +0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2008-10-06 06:13:38 -0400, "Denis Koroskin" <2korden gmail.com> said:

 string foo(Bar o)
 {
      // should we check for null or not?
      return o.toString();
 }
  Bar getBar()
 {
      if (...) {
          return new Bar();
      } else {
          // should we throw an exception or return null?
          return null;
      }
 }
  // should we check for null here?
 string str = foo( getBar() );
  Too many choices. The better code would be as follows:
  string foo(Bar o)
 {
      // no checking needed here, o can't be null!
      return o.toString();
 }
  Bar? getBar()
 {
      if (...) {
          return new Bar();
      } else {
          // ok, we can return null and user *knows* about that!
          // this is very important
          return null;
      }
 }
  string str = foo( getBar() ); // might throw a NullReference  
 exception  (just what we need!)
  alternatively you can make checking yourself:
  Bar? bar = getBar();
 string str = (bar is null) ? "undefined" : foo(bar); // ok, it's safe
  Please, comment!

I like it. And I think it should extend to any pointer. But would this: char* a; // non-null pointer become this: char*? b; // "?" makes the pointer nullable or this: char? b; // "?" means nullable pointer ? "char?" doesn't seem to work since then for consistency "Object?" should be a nullable pointer to an Object reference. But then, "char*?" isn't very appealing.

No, these are different types:

char? <-> Nullable (char)
char*? <-> Nullable (char*)

This is for consistency. T? should be implicitly castable to T:

int? c; // c is uninitialized and stores null
assert(c is null);
// int x = c; // throws an "Uninitialized value used" exception
c = 42;
assert(c !is null);
int z = c; // ok because c is initialized

Another example:

// returns null on error
// compare to C atoi behaviour - returns 0 on error
int? atoi(string str);
 Oh, and does that mean we now have "const(Object)?" to create non-const  
 pointers to const objects? That'd be nice. :-)

 Perhaps "Object" should just be a shortcut for "Object*", which would  
 mean that you could write "const(Object)*" if you wanted, and "char?"  
 could work as a nullable pointer. But then, "==" wouldn't work right for  
 objects (because it'd have to compare pointers), am I right?

Oct 06 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
bearophile wrote:
 Interview to Anders Hejlsberg, one C# author:
 http://www.computerworld.com.au/index.php/id;1149786074;fp;;fpid;;pf;1
 
 50% of the bugs that people run into today, coding with C# in our platform,
and the same is true of Java for that matter, are probably null reference
exceptions. If we had had a stronger type system that would allow you to say
that 'this parameter may never be null, and you compiler please check that at
every call, by doing static analysis of the code'. Then we could have stamped
out classes of bugs.


Hum, it may be true that about 50% of bugs encountered are null pointer exceptions, but certainly 50% is not the percentage of the effort/time spent in debugging NPEs, as generally, NPEs are very easy to detect and to fix. As a very rough estimate, I'd say such effort is about 5%-20%.
So, although I wouldn't say no to a non-nullable types feature, I don't think it would have that much of an impact.

--
Bruno Medeiros - Software Developer, MSc. in CS/E graduate
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 06 2008
"Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Oct 2008 13:33:16 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 On Mon, 06 Oct 2008 13:58:39 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 The two things that need to be changed to support this feature are:
  1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
  The latter rule can be relaxed (as done in C#): you can have a  
 variable uninitialized. However, you can't read from it until you  
 initialize it explicitly. This is enforced statically:
  // The following is ok:
 Object o;
 o = new Object();
  // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
  // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
  Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references? So none of the examples would compile, unless you wrote: Object o = new Object(); or Object o = null; The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them. I can't see any advantage from omitting the "=null;"

That's the whole point - you won't need to check for (o is null) ever. All the references are always valid. This might be a good contract to enforce.

I think you misunderstood me. I wrote "=null" not "==null". It's my experience that most null pointer bugs in D are caused by simply forgetting to initialize the reference. Anyone coming from C++ is extremely likely to do this. This bug could be completely eliminated by requiring objects to be initialized. If you want them to be uninitialised, fine, set them equal to null. Make your intentions clear.

Well, there is no difference between Object o; and Object o = null; - they both will be initialized to null. But I agree that explicit initialization makes an intention more clear.
 Then, an object reference could only ever be null if it was explicitly  
 set to null.

Not, actually:

void foo(Bar b) {
    Bar bb = b; // bb is null == b is null
}
 There are some null pointer exceptions which are caused by returning a  
 null by accident, but they're much rarer, and they're not much different  
 to any other logic error.

function always returns valid reference is a good contract. User will be aware of the potential null returned.
 Yes, they would be eliminated in your proposal. But the cost-benefit is  
 not nearly as good as eliminating the default null initialisation; the  
 benefit is higher, but the cost is hundreds of times higher.

Actually, I thought that my proposal would lead to better optimizations (while ensuring high robustness) than what we have now. For example, right now every function that accepts a class reference ought to check the pointer against null. With my proposal you check just once, upon casting from T? to T. Once you have a T reference it is not null, so no further checking is necessary. And since you operate on T instead of T? most of the time (and assigning from T? to T? doesn't need any checking either), the cost is small enough to ignore. As a result, your programs become more robust. The user null-checks only those references that are returned by 'unsafe' functions (those that may return null), and these are rare enough by themselves. Besides, the assert inside the implicit cast can be turned off in release builds, so it will be blazing fast while preserving the same degree of robustness.

Here is another bonus: imagine you write a Tango container class whose getIterator() method used to construct a new iterator each time and return a (valid) reference even if the container was empty. Now you decide to return null if the container is empty ("why allocate memory if we are empty, anyway?", you thought). How do users know that your method may now return null? They can't, unless they read every library update note. So their code is now in danger!

The reverse situation: your library function that used to return null in some cases now returns valid references (or throws an exception). Users can stop null-checking the returned reference and get a small win. Note that there is no need to change existing user code in this case, so the user gets the win for free!
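To make the "check once at the T? to T boundary" idea concrete, here is a minimal library-level sketch in D. The names NonNull, Bar and useBar are hypothetical and not part of the proposal or of any library; the wrapper only approximates what a built-in non-nullable type would do, and it uses an assert so that the check disappears in release builds, as described above.

```d
/// Hypothetical wrapper: a class reference that was null-checked once.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // no default (null) state is allowed

    this(T value)
    {
        // The single check, performed at the T? -> T boundary.
        assert(value !is null, "null where a valid reference is required");
        payload = value;
    }

    // Holders of a NonNull!T use the reference without re-checking it.
    T get() { return payload; }
}

class Bar
{
    int value() { return 42; }
}

// The callee takes NonNull!Bar, so it needs no null check of its own.
int useBar(NonNull!Bar b)
{
    return b.get().value();
}

unittest
{
    Bar maybeNull = new Bar;              // plays the role of a Bar? value
    auto checked = NonNull!Bar(maybeNull); // checked here, once
    assert(useBar(checked) == 42);
}
```

Every function that accepts NonNull!Bar relies on the check made at construction, which is the saving the post argues for.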
Oct 07 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Oct 2008 05:55:30 +0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2008-10-06 09:11:41 -0400, "Denis Koroskin" <2korden gmail.com> said:

 char? <-> Nullable (char)
 char*? <-> Nullable (char*)

Ok, I think I see your point. We just had different interpretations. Mine was that only pointers (and object references) could be made null; therefore, only pointers and object references could be nullable, but wouldn't be by default. Your idea is that any type can be made nullable. For pointers and object references, this would be represented by null pointers; for other types it'd have to be some kind of struct containing that type and a boolean null-indicator value. Is this correct?

Yes, although you don't have to use them. See http://www.nwcpp.org/Downloads/2006/The_Power_of_None.ppt
 Although I don't dislike this idea, I think nullable value-types offer  
 much less value than nullable pointers and references. For one, making  
 value-types nullable isn't a solution for the problem at hand -- null  
 pointer errors (access violation, segmentation faults, etc.) -- because  
 value-types already don't have this problem. For two, it makes the type  
 memory layout bigger instead of simply allowing it to hold the special  
 zero value.

Agreed, but who said that zero is an illegal integer value? How do I know whether the variable has already been initialized or not?
 That said, I'm not against nullable value-types. My opinion is that  
 perhaps the language could be kept simpler by only allowing pointer and  
 object references to be nullable, because that's where it matters the  
 most.

Yes, you rarely need them, and in most cases they are local variables created on the stack, so the cost is minimal. That said, they are not crucial for the language, but it is better to have them for consistency with reference types.
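The "struct containing that type and a boolean null-indicator" described in this exchange could look roughly like the sketch below. It is an illustration only: the Nullable defined here is a hypothetical hand-written type with assumed member names (isNull, get), not the proposed built-in and not a library type.

```d
/// Hypothetical value-type wrapper: a value plus an "is set" flag.
struct Nullable(T)
{
    private T value;
    private bool hasValue = false;

    this(T v)
    {
        value = v;
        hasValue = true;
    }

    bool isNull() const { return !hasValue; }

    T get()
    {
        assert(hasValue, "reading a null Nullable");
        return value;
    }
}

unittest
{
    Nullable!int unset;            // declared, but never given a value
    auto zero = Nullable!int(0);   // zero is a legitimate, initialized value

    assert(unset.isNull);
    assert(!zero.isNull && zero.get() == 0);

    // The flag is not free: the wrapper is larger than a bare int.
    static assert(Nullable!int.sizeof > int.sizeof);
}
```

The sketch shows both sides of the argument: the flag distinguishes "initialized to zero" from "not initialized", and it makes the memory layout bigger.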
Oct 07 2008
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Oct 07, 2008 at 02:21:23PM +0200, Don wrote:
 My point is simply that:
 
 Object o;
 
 is almost always an error, and I think that making it illegal would 
 catch the #1 trivial bug in D code.

I agree here. When I first started with D, I would write that almost every time I was working with an object, and it would segfault without fail. In C++, writing Object o; is quite common, and as a C++ user coming to D, it took me a while to get away from that habit. I'm sure other C++ users have had the same experience. This change definitely seems like a good idea. If it breaks existing code, it is easy enough to add the = null if you need it, so there is little cost and I think a decent gain. -- Adam D. Ruppe http://arsdnet.net
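For readers who have not hit this themselves, a small D sketch of the habit in question. Widget is a hypothetical class used only for illustration. In C++ the declaration alone constructs an object; in D it declares a class reference that defaults to null.

```d
class Widget
{
    int size() { return 3; }
}

unittest
{
    Widget w;            // D: a reference, default-initialized to null
    assert(w is null);   // no object exists yet; calling w.size() here would crash

    w = new Widget;      // the object has to be allocated explicitly
    assert(w !is null && w.size() == 3);
}
```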
Oct 07 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Oct 2008 16:21:23 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 On Tue, 07 Oct 2008 13:33:16 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 On Mon, 06 Oct 2008 13:58:39 +0400, Don <nospam nospam.com.au> wrote:

 Denis Koroskin wrote:
 The two things that needs to be changed to support this feature are:
  1) make typeof(null) == void*
 2) remove default initializers (for reference types, at least)
  The latter rule can be relaxed (as done in C#): you can have a  
 variable uninitialized. However, you can't read from it until you  
 initialize it explicitly. This is enforced statically:
  // The following is ok:
 Object o;
 o = new Object();
  // This one is ok, too:
 Object o;
 if (condition) {
    o = new Foo();
 } else {
    o = new Bar();
 }
  // But this is rejected:
 Object o;
 if (condition) {
    o = new Foo();
 }
  Object o2 = o; // use of (possibly) uninitialized variable

Why not just disallow uninitialized references? So none of the examples would compile, unless you wrote: Object o = new Object(); or Object o = null; The statement that "50% of all bugs are of this type" is consistent with my experience. This wouldn't get all of them, but I reckon it'd catch most of them. I can't see any advantage from omitting the "=null;"

ever. All the references are always valid. This might be a good contract to enforce.

I think you misunderstood me. I wrote "=null" not "==null". It's my experience that most null pointer bugs in D are caused by simply forgetting to initialize the reference. Anyone coming from C++ is extremely likely to do this. This bug could be completely eliminated by requiring objects to be initialized. If you want them to be uninitialised, fine, set them equal to null. Make your intentions clear.

Well, there is no difference between Object o; and Object o = null; since both are initialized to null. But I agree that explicit initialization makes the intention clearer.
 Then, an object reference could only ever be null if it was explicitly  
 set to null.

 void foo(Bar b) {
     Bar bb = b; // bb is null == b is null
 }

But b must have been explicitly set to null. Someone has made a conscious decision that some reference is permitted to be null. Right now, it can happen by accident.
 There are some null pointer exceptions which are caused by returning a  
 null by accident, but they're much rarer, and they're not much  
 different to any other logic error.

function always returns valid reference is a good contract. User will be aware of the potential null returned.
 Yes, they would be eliminated in your proposal. But the cost-benefit  
 is not nearly as good as eliminating the default null initialisation;  
 the benefit is higher, but the cost is hundreds of times higher.


I wasn't talking about runtime cost, but cost to the language (increase in complexity of the spec, Walter's time). It requires _major_ language changes. New lexing, new types, new name mangling, new code generation.

No, it doesn't! The two changes that are needed are: #1) Disallow Object o; #2) Make typeof(null) == void* so that Object o = null; would be an error. That's all! Everything else should be done at the library level (a Nullable template). T? is just syntactic sugar and we can live without it (for the time being). In fact, this will make the language smaller and simpler. No need for T.init anymore!
 Actually, I thought that my proposal will lead to
 better optimizations (while ensuring high robustness) to what we have  
 now. For example, now every function that accepts a class reference  
 ought to check the pointer against null. With my proposal you should  
 check just once! upon casting from T? to T. Once you got a reference,  
 it is not null so no checking against null is necessary. And since you  
 operate on T instead of T? all the time (and assigning from T? to T?  
 doesn't need any checking either) it is rather enough to ignore the  
 cost.

True. I don't disagree that non-nullable references would be nice to have. I just think that it's unlikely that Walter would do it. My point is simply that: Object o; is almost always an error, and I think that making it illegal would catch the #1 trivial bug in D code.

Yes, this is part of this proposal (see #1) :) Let's start with #1 and introduce #2 later (if it fits the language).
Oct 07 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Tue, Oct 7, 2008 at 9:39 PM, Adam D. Ruppe <destructionator gmail.com> wrote:
 On Tue, Oct 07, 2008 at 02:21:23PM +0200, Don wrote:
 My point is simply that:

 Object o;

 is almost always an error, and I think that making it illegal would
 catch the #1 trivial bug in D code.

I agree here. When I first started with D, I would write that almost every time I was working with an object, and it would segfault without fail. In C++, writing Object o; is quite common, and as a C++ user coming to D, it took me a while to get away from that habit. I'm sure other C++ users have had the same experience. This change definitely seems like a good idea. If it breaks existing code, it is easy enough to add the = null if you need it, so there is little cost and I think a decent gain.

I think most of my bugs of this ilk come from aggregates:

class Foo {
   this() {
      // oops forgot to init x here!
   }
   // lots of stuff
   //...
private:
   Member x;
}

--bb
Oct 07 2008
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Oct 2008 16:42:45 +0400, Bill Baxter <wbaxter gmail.com> wrote:

 On Tue, Oct 7, 2008 at 9:39 PM, Adam D. Ruppe  
 <destructionator gmail.com> wrote:
 On Tue, Oct 07, 2008 at 02:21:23PM +0200, Don wrote:
 My point is simply that:

 Object o;

 is almost always an error, and I think that making it illegal would
 catch the #1 trivial bug in D code.

I agree here. When I first started with D, I would write that almost every time I was working with an object, and it would segfault without fail. In C++, writing Object o; is quite common, and as a C++ user coming to D, it took me a while to get away from that habit. I'm sure other C++ users have had the same experience. This change definitely seems like a good idea. If it breaks existing code, it is easy enough to add the = null if you need it, so there is little cost and I think a decent gain.

I think most of my bugs of this ilk come from aggregates:

 class Foo {
    this() {
       // oops forgot to init x here!
    }
    // lots of stuff
    //...
 private:
    Member x;
 }

 --bb

Yeah, the compiler should catch these. You should mark your Member as nullable to explicitly say that you won't initialize it in the ctor.
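A rough D sketch of the aggregate case as the language behaves today, where nothing flags the forgotten member. The method names id and ready are hypothetical additions for illustration; Foo and Member follow the example quoted above.

```d
class Member
{
    int id() { return 7; }
}

class Foo
{
    private Member x;   // class reference field, null unless assigned

    this()
    {
        // oops: x = new Member; was forgotten, and nothing complains
    }

    // Hypothetical helper, only to observe the state of x.
    bool ready() { return x !is null; }
}

unittest
{
    auto f = new Foo;
    assert(!f.ready());  // x is still null after construction
}
```

Under the proposal, a field like x would either have to be assigned in the constructor or be declared nullable, so this mistake would be rejected at compile time instead of surfacing later at run time.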
Oct 07 2008