
digitalmars.D - Non-null objects, the Null Object pattern, and T.init

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter and I were talking today about the null pointer issue and he had 
the following idea.

One common idiom to replace null pointer exceptions with milder 
reproducible errors is the null object pattern, i.e. there is one object 
that is used in lieu of the null reference to initialize all otherwise 
uninitialized references. In D that would translate naturally to:

class Widget
{
     private int x;
     private Widget parent;
     this(int y) { x = y; }
     ...
     // Here's the interesting part
     static Widget init = new Widget(42);
}

Currently the last line doesn't compile, but we can make it work if the 
respective constructor is callable during compilation. The compiler will 
allocate a static buffer for the "new"ed Widget object and will make 
init point there.

Whenever a Widget is to be default-initialized, it will point to 
Widget.init (i.e. it won't be null). This beautifully extends the 
language because currently (with no init definition) Widget.init is null.

So the init Widget will satisfy:

assert(x == 42 && parent is Widget.init);

Further avenues are opened by thinking what happens if e.g. init is 
private or @disable-d.

Thoughts?


Andrei
Jan 16 2014
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu 
wrote:
     static Widget init = new Widget(42);
The problem here is if it is in a mutable data segment:

Widget i; // implicitly set to init
assert(i is Widget.init);
// uh oh
i.mutate();

Widget i2; // set to widget init...
assert(i is i2); // oh dear
// i2 now reflects the mutation done above!

And if the init is put in a read-only segment, similar to a string 
buffer, we still get a segmentation fault when we try to mutate it.

I'm not necessarily against having the option, but I think it would 
generally be more trouble than it is worth. If you have an object that 
you want to guarantee is not null, make the nullable implementation

class Widget_impl {}

then define

alias Widget = NotNull!Widget_impl;

and force initialization that way.
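[A minimal sketch of what such a NotNull wrapper could look like; the struct name and API here are invented for illustration, not an existing library type:]

```d
// Hypothetical NotNull wrapper: a struct that can only be constructed
// from a non-null reference, so an alias of it can never hold null.
struct NotNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // no default construction, hence no null state

    this(T value)
    {
        assert(value !is null, "NotNull constructed with null");
        payload = value;
    }

    // forward all member access to the wrapped object
    alias payload this;
}

class Widget_impl { int x; }
alias Widget = NotNull!Widget_impl;

void main()
{
    auto w = Widget(new Widget_impl()); // must be initialized explicitly
    w.x = 42; // used like a normal reference via alias this
}
```

[Because default construction is @disable-d, `Widget w;` without an initializer becomes a compile error, which is the forced initialization described above.]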
Jan 16 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 5:53 PM, Adam D. Ruppe wrote:
 On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu wrote:
     static Widget init = new Widget(42);
 The problem here is if it is in a mutable data segment:

 Widget i; // implicitly set to init
 assert(i is Widget.init);
 // uh oh
 i.mutate();

 Widget i2; // set to widget init...
 assert(i is i2); // oh dear
 // i2 now reflects the mutation done above!
Yah, that would be expected.
 I'm not necessarily against having the option, but I think it would
 generally be more trouble than it is worth. If you have an object that
 you want to guarantee is not null, make the nullable implementation

 class Widget_impl {}

 then define

 alias Widget = NotNull!Widget_impl;

 and force initialization that way.
Noted.

Andrei
Jan 16 2014
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 17 January 2014 at 02:04:27 UTC, Andrei Alexandrescu 
wrote:
 Yah, that would be expected.
Yeah, but I think people would find it weird. This kind of thing is 
actually possible today:

class Foo { }

class Bar {
     Foo foo = new Foo(); // this gets a static reference into the
                          // typeinfo.init (since new Foo is evaluated
                          // at compile time!) which is blitted over...
}

void main() {
     auto bar = new Bar(); // bar.foo is blitted to point at the static Foo
     auto bar2 = new Bar(); // and same thing
     assert(bar.foo is bar2.foo); // passes, but that's kinda weird
}

Granted, maybe the weirdness here is because the variable isn't static, 
so people expect it to be different, but I saw at least one person on 
the chat room be very surprised by this - and I was too, until I thought 
about how CTFE and Classinfo.init is implemented, then it made sense, 
but at first glance it was a bit weird.

I think people would be a bit surprised if it was "Foo foo;" as well 
using the proposed .init thingy.
Jan 16 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 6:11 PM, Adam D. Ruppe wrote:
 On Friday, 17 January 2014 at 02:04:27 UTC, Andrei Alexandrescu wrote:
 Yah, that would be expected.
 Yeah, but I think people would find it weird. This kind of thing is
 actually possible today:

 class Foo { }

 class Bar {
      Foo foo = new Foo(); // this gets a static reference into the
                           // typeinfo.init (since new Foo is evaluated
                           // at compile time!) which is blitted over...
 }
That wouldn't be allowed.

Andrei
Jan 16 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/16/2014 6:11 PM, Adam D. Ruppe wrote:
 I think people would be a bit surprised if it was "Foo foo;"
We really don't want any foo-foo stuff in D.
Jan 16 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/17/2014 03:11 AM, Adam D. Ruppe wrote:
 On Friday, 17 January 2014 at 02:04:27 UTC, Andrei Alexandrescu wrote:
 Yah, that would be expected.
 Yeah, but I think people would find it weird. This kind of thing is
 actually possible today:

class Foo{ int x = 2; }

class Bar{ auto foo = new Foo(); }

void main(){
     auto ibar = new immutable(Bar)();
     auto bar = new Bar();
     static assert(is(typeof(ibar.foo.x)==immutable));
     assert(ibar.foo.x==2);
     bar.foo.x=3;
     assert(ibar.foo.x==3); // uh-oh!
}
Jan 16 2014
parent Jacob Carlborg <doob me.com> writes:
On 2014-01-17 07:18, Timon Gehr wrote:

 class Foo{ int x = 2; }

 class Bar{ auto foo = new Foo(); }

 void main(){
      auto ibar = new immutable(Bar)();
      auto bar = new Bar();
      static assert(is(typeof(ibar.foo.x)==immutable));
      assert(ibar.foo.x==2);
      bar.foo.x=3;
      assert(ibar.foo.x==3); // uh-oh!
 }
Oh, that's just bad. Yet another hole in the type system.

-- 
/Jacob Carlborg
Jan 17 2014
prev sibling next sibling parent reply "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu 
wrote:
 Walter and I were talking today about the null pointer issue 
 and he had the following idea.

 One common idiom to replace null pointer exceptions with milder 
 reproducible errors is the null object pattern, i.e. there is 
 one object that is used in lieu of the null reference to 
 initialize all otherwise uninitialized references. In D that 
 would translate naturally to:

 class Widget
 {
     private int x;
     private Widget parent;
     this(int y) { x = y; }
     ...
     // Here's the interesting part
     static Widget init = new Widget(42);
 }

 Currently the last line doesn't compile, but we can make it 
 work if the respective constructor is callable during 
 compilation. The compiler will allocate a static buffer for the 
 "new"ed Widget object and will make init point there.

 Whenever a Widget is to be default-initialized, it will point 
 to Widget.init (i.e. it won't be null). This beautifully 
 extends the language because currently (with no init 
 definition) Widget.init is null.

 So the init Widget will satisfy:

 assert(x == 42 && parent is Widget.init);

 Further avenues are opened by thinking what happens if e.g. 
 init is private or @disable-d.

 Thoughts?


 Andrei
I am assuming init will be a property static function. So essentially 
we would be removing .init support for classes from the compiler and 
pushing it out into Object.

If that is an option, I would rather have an init static function that 
is required to have override; otherwise it would be too 'magical' for 
me at least. The default can still be null.
Jan 16 2014
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 17 January 2014 at 01:57:24 UTC, Rikki Cattermole 
wrote:
 I am assuming init will be a property static function. So 
 essentially we would be removing .init support for classes in 
 compiler and pushing it out into Object.
I don't think a function would work without major work since that would mean adding non-blit* default construction to the language - which isn't present at all right now.
Jan 16 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 5:57 PM, Rikki Cattermole wrote:
 I am assuming init will be a property static function.
In the example above it's just a static data member.
 So essentially we
 would be removing .init support for classes in compiler and pushing it
 out into Object.

 I would rather if that is an option an init static function, that it
 would be required to have override otherwise it would be too 'magical'
 for me at least.
 The default can still be null.
Override only works for nonstatic methods.

Andrei
Jan 16 2014
parent "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Friday, 17 January 2014 at 02:06:03 UTC, Andrei Alexandrescu 
wrote:
 I would rather if that is an option an init static function, 
 that it
 would be required to have override otherwise it would be too 
 'magical'
 for me at least.
 The default can still be null.
Override only works for nonstatic methods.
I know it doesn't. It's more for making it nicer to read than anything. 
An actual property as you've shown would be nice. But having another 
place which can be abused, and which you can write code for, is nice.
Jan 16 2014
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu 
wrote:
 Whenever a Widget is to be default-initialized, it will point 
 to Widget.init (i.e. it won't be null). This beautifully 
 extends the language because currently (with no init 
 definition) Widget.init is null.
Hmm, what about derived classes? How do you check for a valid Widget 
given a DerivedWidget.init?

class DerivedWidget : Widget
{
     static DerivedWidget init = new DerivedWidget(...);
}

bool valid(Widget w) { return w !is Widget.init; }

DerivedWidget foo;
assert(!valid(foo)); // doesn't fire, foo is valid?

The nice thing about null is that it is the bottom type, so it is a 
universal sentinel value.

Also, what about interfaces? You cannot define an init for an 
interface. Obviously that could just be a known and accepted limitation 
of this proposal, but I'm not a huge fan of solutions that only work in 
a subset of situations. Perhaps there is a solution that I haven't 
thought of.
 assert(x == 42 && parent is Widget.init);
Is that meant to say "x is Widget.init"?
Jan 16 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 6:02 PM, Peter Alexander wrote:
 On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu wrote:
 Whenever a Widget is to be default-initialized, it will point to
 Widget.init (i.e. it won't be null). This beautifully extends the
 language because currently (with no init definition) Widget.init is null.
 Hmm, what about derived classes? How do you check for a valid Widget
 given a DerivedWidget.init?

 class DerivedWidget : Widget
 {
      static DerivedWidget init = new DerivedWidget(...);
 }

 bool valid(Widget w) { return w !is Widget.init; }

 DerivedWidget foo;
 assert(!valid(foo)); // doesn't fire, foo is valid?

 The nice thing about null is that it is the bottom type, so it is a
 universal sentinel value.
This is a good point.
 Also, what about interfaces? You cannot define an init for an interface.
 Obviously that could just be a known and accepted limitation of this
 proposal, but I'm not a huge fan of solutions that only work in a subset
 of situations. Perhaps there is a solution that I haven't thought of.


 assert(x == 42 && parent is Widget.init);
Is that meant to say "x is Widget.init"?
To clarify:

assert(Widget.init.x == 42 && Widget.init.parent is Widget.init);

Andrei
Jan 16 2014
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 17 January 2014 at 02:08:07 UTC, Andrei Alexandrescu 
wrote:
 assert(x == 42 && parent is Widget.init);
Is that meant to say "x is Widget.init"?
 To clarify:

 assert(Widget.init.x == 42 && Widget.init.parent is Widget.init);
Ah right, missed the parent parameter... Also ignore my previous 
comment in that case :-)

However, I'm not sure this is good. Those kinds of reference cycles can 
easily lead to infinite loops. Is that really an improvement on a 
null-pointer dereference?
Jan 16 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 6:13 PM, Peter Alexander wrote:
 However, I'm not sure this is good. Those kinds of reference cycles can
 easily lead to infinite loops. Is that really an improvement on a
 null-pointer dereference?
You don't have to use the null object pattern for all classes. It's 
opt-in.

Andrei
Jan 16 2014
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu
wrote:
 Walter and I were talking today about the null pointer issue 
 and he had the following idea.

 One common idiom to replace null pointer exceptions with milder 
 reproducible errors is the null object pattern, i.e. there is 
 one object that is used in lieu of the null reference to 
 initialize all otherwise uninitialized references. In D that 
 would translate naturally to:

 class Widget
 {
     private int x;
     private Widget parent;
     this(int y) { x = y; }
     ...
     // Here's the interesting part
     static Widget init = new Widget(42);
 }

 Currently the last line doesn't compile, but we can make it 
 work if the respective constructor is callable during 
 compilation. The compiler will allocate a static buffer for the 
 "new"ed Widget object and will make init point there.

 Whenever a Widget is to be default-initialized, it will point 
 to Widget.init (i.e. it won't be null). This beautifully 
 extends the language because currently (with no init 
 definition) Widget.init is null.

 So the init Widget will satisfy:

 assert(x == 42 && parent is Widget.init);

 Further avenues are opened by thinking what happens if e.g. 
 init is private or @disable-d.

 Thoughts?


 Andrei
Most objects don't have a sensible init value. That is just hiding
the problem under the carpet.
Jan 16 2014
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 17 January 2014 at 02:07:08 UTC, deadalnix wrote:
 Most objects don't have a sensible init value. That is just 
 hiding
 the problem under the carpet.
That's true.

class Widget
{
     Widget parent;
     static Widget init = ???
}

How do you define init?
Jan 16 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/16/14 6:09 PM, Peter Alexander wrote:
 On Friday, 17 January 2014 at 02:07:08 UTC, deadalnix wrote:
 Most objects don't have a sensible init value. That is just hiding
 the problem under the carpet.
 That's true.

 class Widget
 {
      Widget parent;
      static Widget init = ???
 }

 How do you define init?
Depends on what you want. Could be null or could have itself as a 
parent. The null object pattern is what it is.

Andrei
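[Under the proposed extension — not valid in current D, where init is a built-in property — the self-parenting sentinel mentioned above might be wired up roughly like this; a sketch only:]

```d
// Sketch: assumes the proposal lets a class declare its own static
// init object. The sentinel is its own parent, so walking a parent
// chain terminates at a fixed point instead of at null.
class Widget
{
    Widget parent;

    static Widget init; // hypothetical override of the built-in .init

    static this()
    {
        init = new Widget;
        init.parent = init; // the null object is its own parent
    }
}
```

[A parent-chain walk would then stop when `w.parent is w`, rather than testing for null.]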
Jan 16 2014
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 Most objects don't have a sensible init value. That is just 
 hiding the problem under the carpet.
If there's desire to solve this problem I think that improving the 
type system to avoid nulls where they are not desired is better than 
having an init object. So aren't not-nullable pointers and references 
a better solution?

Bye,
bearophile
Jan 16 2014
parent reply "inout" <inout gmail.com> writes:
On Friday, 17 January 2014 at 02:52:15 UTC, bearophile wrote:
 deadalnix:

 Most objects don't have a sensible init value. That is just 
 hiding the problem under the carpet.
 If there's desire to solve this problem I think that improving the
 type system to avoid nulls where they are not desired is better than
 having an init object. So aren't not-nullable pointers and references
 a better solution?

 Bye,
 bearophile
This! Also, if anything, it's better to turn `init` into a method 
rather than an object. The following would work all of a sudden:

class Foo {
     Bar bar = new Bar();
     int i = 42;

     this() {
         assert(bar !is null);
         assert(i == 42);
     }

     // auto-generated
     private final void init(Foo foo) {
         foo.bar = new Bar();
         foo.i = 42;
     }
}
Jan 16 2014
parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Friday, 17 January 2014 at 03:02:57 UTC, inout wrote:
 On Friday, 17 January 2014 at 02:52:15 UTC, bearophile wrote:
 deadalnix:

 Most objects don't have a sensible init value. That is just 
 hiding the problem under the carpet.
 If there's desire to solve this problem I think that improving the
 type system to avoid nulls where they are not desired is better than
 having an init object. So aren't not-nullable pointers and references
 a better solution?

 Bye,
 bearophile

 This! Also, if anything, it's better to turn `init` into a method
 rather than an object. The following would work all of a sudden:

 class Foo {
      Bar bar = new Bar();
      int i = 42;

      this() {
          assert(bar !is null);
          assert(i == 42);
      }

      // auto-generated
      private final void init(Foo foo) {
          foo.bar = new Bar();
          foo.i = 42;
      }
 }
That would indeed be a nice solution and would, AFAIK, break nothing. :)
Jan 17 2014
parent "Namespace" <rswhite4 googlemail.com> writes:
On Friday, 17 January 2014 at 08:13:05 UTC, Namespace wrote:
 On Friday, 17 January 2014 at 03:02:57 UTC, inout wrote:
 On Friday, 17 January 2014 at 02:52:15 UTC, bearophile wrote:
 deadalnix:

 Most objects don't have a sensible init value. That is just 
 hiding the problem under the carpet.
 If there's desire to solve this problem I think that improving the
 type system to avoid nulls where they are not desired is better than
 having an init object. So aren't not-nullable pointers and references
 a better solution?

 Bye,
 bearophile

 This! Also, if anything, it's better to turn `init` into a method
 rather than an object. The following would work all of a sudden:

 class Foo {
      Bar bar = new Bar();
      int i = 42;

      this() {
          assert(bar !is null);
          assert(i == 42);
      }

      // auto-generated
      private final void init(Foo foo) {
          foo.bar = new Bar();
          foo.i = 42;
      }
 }

 That would indeed be a nice solution and would, AFAIK, break nothing. :)
But IMO even better would be something like this:

----
class A {
    int id;

    this(int id) {
        this.id = id;
    }

    static A init() {
        return new A(42);
    }
}

A a; /// <-- A a = A.init; --> A a = new A(42);
----

Define your own init method which initializes the object to not null.
Jan 17 2014
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-17 01:42:37 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Walter and I were talking today about the null pointer issue and he had 
 the following idea.
 
 [...]
 
 Thoughts?
Which null pointer issue were you discussing exactly?

The one I'm mostly concerned about is the propagation of a null value 
from point A to point B in a program, where you only detect the null 
value at point B through a null dereference, making it hard to figure 
out where this unexpected null value comes from. Replace 'null' with 
'sentinel' and it's no easier to figure out where the invalid value 
comes from. Except now instead of checking for null you have to check 
for null *and* for T.init *and* for all the various Subclass.init 
values you might get.

The idea is interesting, but I see it creating more problems than it 
would solve. I'm not even sure I understand what problem it tries to 
solve.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Jan 16 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 16, 2014 at 11:00:24PM -0500, Michel Fortin wrote:
 On 2014-01-17 01:42:37 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
Walter and I were talking today about the null pointer issue and
he had the following idea.

[...]

Thoughts?
 Which null pointer issue were you discussing exactly?

 The one I'm mostly concerned about is the propagation of a null value
 from point A to point B in a program, where you only detect the null
 value at point B through a null dereference, making it hard to figure
 out where this unexpected null value comes from. Replace 'null' with
 'sentinel' and it's no easier to figure out where the invalid value
 comes from. Except now instead of checking for null you have to check
 for null *and* for T.init *and* for all the various Subclass.init
 values you might get.

 The idea is interesting, but I see it creating more problems than it
 would solve. I'm not even sure I understand what problem it tries to
 solve.
[...]

AFAICT the null object pattern is intended to prevent dereferencing a 
null pointer (i.e., prevent a segfault on Posix), by providing a dummy 
object that has no-op stubs for methods and default values for fields. 
But from what I've been observing, what people have been asking for on 
this forum has been more a way of *preventing* null (or a sentinel, as 
here) from being assigned to a reference to begin with. So as far as 
that is concerned, this proposal doesn't really address the issue.

Now, if we modify this sentinel to instead record the location of the 
code that first initialized it (via __FILE__ and __LINE__ default 
parameters perhaps), then we can set it up to print out this 
information at a convenient juncture, so that the source of the 
uninitialized reference can be determined. *Then* perhaps it will be 
the start of a solution to this issue. (Though it still has limitations 
in the sense that the problem can only be caught at runtime, whereas 
some cases of null dereference preferably should be caught at 
compile-time.)


T

-- 
Music critic: "That's an imitation fugue!"
Jan 16 2014
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 17 Jan 2014 05:29:05 -0000, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:
 Now, if we modify this sentinel to instead record the location of the
 code that first initialized it (via __FILE__ and __LINE__ default
 parameters perhaps), then we can set it up to print out this information
 at a convenient juncture, so that the source of the uninitialized
 reference can be determined. *Then* perhaps it will be a start of a
 solution to this issue. (Though it still has limitations in the sense
 that the problem can only be caught at runtime, whereas some cases of
 null dereference preferably should be caught at compile-time.)
So.. if we had a base class for all objects which obtained the file and 
line when created by assignment (from init) and threw on any 
dereference (opDispatch), that would do it, right?

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Jan 17 2014
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 12:51:38PM -0000, Regan Heath wrote:
 On Fri, 17 Jan 2014 05:29:05 -0000, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
Now, if we modify this sentinel to instead record the location of the
code that first initialized it (via __FILE__ and __LINE__ default
parameters perhaps), then we can set it up to print out this
information at a convenient juncture, so that the source of the
uninitialized reference can be determined. *Then* perhaps it will be
a start of a solution to this issue. (Though it still has limitations
in the sense that the problem can only be caught at runtime, whereas
some cases of null dereference preferably should be caught at
compile-time.)
 So.. if we had a base class for all objects which obtained the file
 and line when created by assignment (from init) and threw on any
 dereference (opDispatch), that would do it, right?
[...]

If Andrei's proposal were extended so that .init can be overridden by a 
member *function*, then this would work:

	class NullTracker {
		override typeof(this) init(string _file=__FILE__,
						size_t _line = __LINE__)
		{
			class Impl : NullTracker {
				string file;
				size_t line;
				this(string f, size_t l) { file=f; line=l; }

				override void method1() { nullDeref(); }
				override void method2() { nullDeref(); }
				...
				void nullDeref() {
					// N.B.: puts the *source* of the
					// null in the Exception.
					throw new Exception(
						"Null dereference",
						file, line);
				}
			}
			return new Impl(_file, _line);
		}

		void method1() {}
		void method2() {}
		...
	}


T

-- 
People walk. Computers run.
Jan 17 2014
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 10:00:26AM -0800, H. S. Teoh wrote:
[...]
 If Andrei's proposal were extended so that .init can be overridden by a
 member *function*, then this would work:
 
 	class NullTracker {
 		override typeof(this) init(string _file=__FILE__,
 						size_t _line = __LINE__)
 		{
 			class Impl : NullTracker {
 				string file;
 				size_t line;
 				this(string f, size_t l) { file=f; line=l; }
 
 				override void method1() { nullDeref(); }
 				override void method2() { nullDeref(); }
 				...
 				void nullDeref() {
 					// N.B.: puts the *source* of the
 					// null in the Exception.
 					throw new Exception(
 						"Null dereference",
 						file, line);
[...]

P.S. A better message might be "Null dereference, uninitialized object 
from %s:%d", to distinguish the site of the null dereference vs. its 
ultimate source.


T

-- 
"How are you doing?" "Doing what?"
Jan 17 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Friday, 17 January 2014 at 05:30:45 UTC, H. S. Teoh wrote:
 Now, if we modify this sentinel to instead record the location 
 of the
 code that first initialized it (via __FILE__ and __LINE__ 
 default
 parameters perhaps), then we can set it up to print out this 
 information
 at a convenient juncture, so that the source of the 
 uninitialized
 reference can be determined. *Then* perhaps it will be a start 
 of a
Which is easy to do (in theory). Page zero is over a thousand unique 
addresses. All you have to do is to convert null-tests to a range 
test. E.g.

(addr & MASK) == 0
Jan 19 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Jan 19, 2014 at 09:41:20PM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Friday, 17 January 2014 at 05:30:45 UTC, H. S. Teoh wrote:
Now, if we modify this sentinel to instead record the location of the
code that first initialized it (via __FILE__ and __LINE__ default
parameters perhaps), then we can set it up to print out this
information at a convenient juncture, so that the source of the
uninitialized reference can be determined. *Then* perhaps it will be
a start of a
 Which is easy to do (in theory). Page zero is over a thousand unique
 addresses. All you have to do is to convert null-tests to a range
 test. E.g.

 (addr & MASK) == 0
It's not that simple. In assembly, usually the pointer value is just a 
base address; you add the field offset on top of that before you even 
attempt to load anything from memory. So you can't use "different 
nulls" to represent different source locations that easily.


T

-- 
There's light at the end of the tunnel. It's the oncoming train.
Jan 19 2014
parent reply "Ola Fosheim Grøstad" writes:
On Monday, 20 January 2014 at 01:22:34 UTC, H. S. Teoh wrote:
 adresses. All you have to do is to convert null-tests to a 
 range
 test. E.g. (addr&MASK)==0
It's not that simple. In assembly, usually the pointer value is just a base address, you add the field offset on top of that before you even attempt to load anything from memory. So you can't use "different nulls" to represent different source locations that easily.
Why not? You don't trap address zero; the MMU traps pages of 4K size on 
x86. But when you have an explicit nullptr test you will have to mask 
it before testing the zero-flag in the control register.
Jan 20 2014
parent "Ola Fosheim Grøstad" writes:
On Monday, 20 January 2014 at 12:20:58 UTC, Ola Fosheim Grøstad 
wrote:
 But when you have an explicit nullptr test you will have to 
 mask it before testing the zero-flag in the control register.
And just to make it explicit: you will have to add the masking logic to 
all comparisons of nullable pointers too, if you want to allow 
comparison of null to be valid:

aptr == bptr  =>  (((aptr & MASK) == 0) && ((bptr & MASK) == 0)) || aptr == bptr

or something like that.
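[The lowering sketched above might look like this in code; a sketch only, assuming the compiler reserves page zero (the low 4K of the address space) for distinct null sentinels, with names invented for illustration:]

```d
// Sketch of null tests lowered to a range test over page zero.
enum size_t MASK = ~cast(size_t)0xFFF; // clears the low 12 bits (one 4K page)

bool isNullLike(const(void)* p)
{
    // any address inside the (trapped) zero page counts as "null",
    // leaving the low bits free to carry a sentinel payload
    return (cast(size_t)p & MASK) == 0;
}

bool ptrEqual(const(void)* a, const(void)* b)
{
    // two null-like pointers compare equal even when their sentinel
    // payloads (the low bits) differ
    return (isNullLike(a) && isNullLike(b)) || a is b;
}
```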
Jan 20 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/16/2014 5:42 PM, Andrei Alexandrescu wrote:
 Walter and I were talking today about the null pointer issue and he had the
 following idea.

 One common idiom to replace null pointer exceptions with milder reproducible
 errors is the null object pattern, i.e. there is one object that is used in
lieu
 of the null reference to initialize all otherwise uninitialized references. In
D
 that would translate naturally to:

 class Widget
 {
      private int x;
      private Widget parent;
      this(int y) { x = y; }
      ...
      // Here's the interesting part
      static Widget init = new Widget(42);
 }
I was thinking of:

     @property static Widget init() { ... }
Jan 16 2014
parent Artur Skawina <art.08.09 gmail.com> writes:
On 01/17/14 06:33, Walter Bright wrote:
 On 1/16/2014 5:42 PM, Andrei Alexandrescu wrote:
 Walter and I were talking today about the null pointer issue and he had the
 following idea.

 One common idiom to replace null pointer exceptions with milder reproducible
 errors is the null object pattern, i.e. there is one object that is used in
lieu
 of the null reference to initialize all otherwise uninitialized references. In
D
 that would translate naturally to:

 class Widget
 {
      private int x;
      private Widget parent;
      this(int y) { x = y; }
      ...
      // Here's the interesting part
      static Widget init = new Widget(42);
 }
 I was thinking of:

      @property static Widget init() { ... }
default() { ... }

because a) you've just reinvented default construction; b) overloading 
'init' like that is a bad idea (should this 'init()' be called before 
invoking a "normal" ctor?...)

artur
Jan 17 2014
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/17/2014 02:42 AM, Andrei Alexandrescu wrote:
 class Widget
 {
      private int x;
      private Widget parent;
      this(int y) { x = y; }
      ...
      // Here's the interesting part
      static Widget init = new Widget(42);
 }

 Currently the last line doesn't compile, but we can make it work if the
 respective constructor is callable during compilation. The compiler will
 allocate a static buffer for the "new"ed Widget object and will make
 init point there.
immutable(Widget) iwidget; // ?
Jan 16 2014
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2014-01-17 02:42, Andrei Alexandrescu wrote:

 Thoughts?
As others have said, I don't really see the point in this. It just 
replaces the existing sentinel value (null) with a different one.

-- 
/Jacob Carlborg
Jan 17 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu 
wrote:
 Walter and I were talking today about the null pointer issue 
 and he had the following idea.

 One common idiom to replace null pointer exceptions with milder 
 reproducible errors is the null object pattern, i.e. there is 
 one object that is used in lieu of the null reference to 
 initialize all otherwise uninitialized references. In D that 
 would translate naturally to:

 class Widget
 {
     private int x;
     private Widget parent;
     this(int y) { x = y; }
     ...
     // Here's the interesting part
     static Widget init = new Widget(42);
 }

 Currently the last line doesn't compile, but we can make it 
 work if the respective constructor is callable during 
 compilation. The compiler will allocate a static buffer for the 
 "new"ed Widget object and will make init point there.

 Whenever a Widget is to be default-initialized, it will point 
 to Widget.init (i.e. it won't be null). This beautifully 
 extends the language because currently (with no init 
 definition) Widget.init is null.

 So the init Widget will satisfy:

 assert(x == 42 && parent is Widget.init);

 Further avenues are opened by thinking what happens if e.g. 
 init is private or  disable-d.

 Thoughts?


 Andrei
Throwing out some more ideas, though none of this compiles right now (and I managed to make the compiler ICE).

class Widget
{
    private int x;

    this(int y) { x = y; }

    public @property int halfOfX() { return x / 2; }

    public void printHello() { std.stdio.writeln("Hello"); }

    //Error: 'this' is only defined in non-static member functions,
    //not Widget
    static Widget init = new class Widget
    {
        @disable public override @property int halfOfX();
        @disable public override void printHello();

        this() { super(int.init); }
    };
}

Another idea that unfortunately doesn't work due to the way static assert works:

class Widget
{
    private int x;

    this(int y) { x = y; }

    public @property int halfOfX() { return x / 2; }

    public void printHello() { std.stdio.writeln("Hello"); }

    static Widget init = new class Widget
    {
        //Error: static assert "Tried to access halfOfX
        //property of null Widget"
        override @property int halfOfX()
        {
            static assert(false,
                "Tried to access halfOfX property of null Widget");
        }

        override void printHello()
        {
            static assert(false,
                "Tried to access printHello method of null Widget");
        }

        this() { super(int.init); }
    };
}

Is there any way to get around this?

And then there's this, which causes an ICE:

class Widget
{
    private int x;

    this(int y) { x = y; }

    public @property int halfOfX() { return x / 2; }

    public void printHello() { std.stdio.writeln("Hello"); }

    static Widget init = new class Widget
    {
        this() { }
    };
}
Jan 17 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu 
wrote:
 One common idiom to replace null pointer exceptions with milder 
 reproducible errors is the null object pattern, i.e. there is
Usually null failures are hard to track when some function returned a null value as an indication of error when the programmer expected an exception. In that case early failure is better.

I guess it could make sense with "weak" pointers if they are wiped by the GC. Like if you delete a User object, then all "weak" pointers are set to an UnknownUser default object. So that you can call user.isLoggedIn() and get false, but you should be able to default to a different class to get a different set of virtual calls.

I could also see the use-case for lazy allocation. Basically having memory-cheap "smart" pointers that are checked when you trap zero-page access and automatically alloc/init an empty object before recovering. But the overhead is large.

Not sure if init() is the right way to do it though, because you might need something more flexible. E.g. maybe you want access to the node that contained the offending null-pointer.
Jan 17 2014
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 17 January 2014 at 14:06:59 UTC, Ola Fosheim Grøstad 
wrote:
 Like if you delete a User object, then all "weak" pointers are 
 set to a UnknownUser default object. So that you can call 
 user.isLoggedIn() and get false, but you should be able to 
 default to a different Class to get a different set of virtual 
 calls.
This particular thing is easy enough to do with null too:

bool isLoggedIn(User user) {
    return user is null ? false : user.isLoggedInImpl();
}
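A complete, runnable version of the same guard (User and isLoggedInImpl are illustrative names taken from the snippet): thanks to UFCS, the free function reads like a method but is safe on a null receiver:

```d
import std.stdio;

class User
{
    bool isLoggedInImpl() { return true; }
}

// Free function: safe to call even when user is null.
bool isLoggedIn(User user)
{
    return user is null ? false : user.isLoggedInImpl();
}

void main()
{
    User nobody;                    // default-initialized to null
    auto someone = new User;
    writeln(nobody.isLoggedIn());   // UFCS call on a null reference: false
    writeln(someone.isLoggedIn());  // true
}
```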
Jan 17 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 17 January 2014 at 14:13:40 UTC, Adam D. Ruppe wrote:
 bool isLoggedIn(User user) {
     return user is null ? false : user.isLoggedInImpl();
 }
In most cases you can avoid null-related issues if you plan for it, but if you adapt existing software you might not afford to do the changes. Even simple adaptation of a framework can cost many weeks of hard work, and then you need to keep it in sync with future versions of the framework.

I also think there might be cases where you run the null-test a lot of times with the same result and would be better off without it. Like in deeply nested data structures where the pointer almost never is null, and then a trap with recovery might be cheaper, amortized.

Sometimes you can write more efficient data structures if you ban deletion. Like:

ptr = someobject;
weakcontainer.add(ptr);
...
ptr.setdeleted();
ptr = null;
...
weakptr = weakcontainer.get(id);
if (weakptr.isValid()) ...;

In this case weakptr would point to the original object before GC, and to the init object after GC. Saving space, but not requiring changes to the weakcontainer implementation.
Jan 17 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:06 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu wrote:
 One common idiom to replace null pointer exceptions with milder reproducible
 errors is the null object pattern, i.e. there is
Usually null failures are hard to track when some function returned a null value as an indication of error when the programmer expected an exception.
I've almost never had a problem tracking down the cause of a null pointer. Usually just a few minutes with a debugger and getting a backtrace.
Jan 17 2014
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 17 January 2014 at 19:43:58 UTC, Walter Bright wrote:
 I've almost never had a problem tracking down the cause of a 
 null pointer. Usually just a few minutes with a debugger and 
 getting a backtrace.
It's not always easy to run a debugger on it. I just had an exception (not even a segfault) take two days to track down since, 1) the std.variant exception messages SUCK and 2) after finding where the exception was coming from, it was still hard to find why the input was weird.

A bad value from a third-party server (generated with one specific user's request) got stored and used a bit later. The problem wasn't where it was used: it was where it came from. And that's what NotNull seeks to help with.
Jan 17 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 17 January 2014 at 19:43:58 UTC, Walter Bright wrote:
 I've almost never had a problem tracking down the cause of a 
 null pointer. Usually just a few minutes with a debugger and 
 getting a backtrace.
Doesn't work if the unexpected "null" sits in a graph and the source of it is hard to pinpoint or occurs "randomly". E.g. if you are using a "black box" framework or it happens spuriously on a server because it is triggered by a database timeout which never happens on the dev server.
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 12:35 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Friday, 17 January 2014 at 19:43:58 UTC, Walter Bright wrote:
 I've almost never had a problem tracking down the cause of a null pointer.
 Usually just a few minutes with a debugger and getting a backtrace.
Doesn't work if the unexpected "null" sits in a graph and the source of it is hard to pinpoint or occurs "randomly". E.g. if you are using a "black box" framework or it happens spuriously on a server because it is triggered by a database timeout which never happens on the dev server.
As I replied elsewhere, tracking down the source of a bad value in any variable is a standard debugging problem. There isn't anything special about null in this regard.
Jan 17 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 1:31 PM, Walter Bright wrote:
 On 1/17/2014 12:35 PM, "Ola Fosheim Grøstad"
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Friday, 17 January 2014 at 19:43:58 UTC, Walter Bright wrote:
 I've almost never had a problem tracking down the cause of a null
 pointer.
 Usually just a few minutes with a debugger and getting a backtrace.
Doesn't work if the unexpected "null" sits in a graph and the source of it is hard to pinpoint or occurs "randomly". E.g. if you are using a "black box" framework or it happens spuriously on a server because it is triggered by a database timeout which never happens on the dev server.
As I replied elsewhere, tracking down the source of a bad value in any variable is a standard debugging problem. There isn't anything special about null in this regard.
One problem with null is it's not a "proportional response", i.e. it takes the application out back and shoots it in the head instead of e.g. a stale display. That _does_ make null special; there's a large category of applications for which aborting is simply not an option.

Andrei
Jan 17 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 4:15 PM, Andrei Alexandrescu wrote:
 One problem with null is it's not "proportional response", i.e. it takes the
 application in the back and shoots it in the head instead of e.g. a stale
 display. That _does_ make null special; there's a large category of application
 for which aborting is simply not an option.
I'm aware of that, I was responding to the idea that null pointers are harder to track down once detected. In particular, I don't see how a stale display is an easier bug to track down, although granted it is less annoying to the end user.
Jan 17 2014
prev sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 17 January 2014 at 21:31:16 UTC, Walter Bright wrote:
 As I replied elsewhere, tracking down the source of a bad value 
 in any variable is a standard debugging problem. There isn't 
 anything special about null in this regard.
It isn't special in the theoretical sense, but since it in practice is used for a wide variety of things it becomes a sensitive issue in situations where you cannot replay input. Especially since it is used for linking together subsystems.

In multi user games you can do fine with wrong "float values", objects might drift, you might get odd effects, but those bugs can be cool. They might even become features. A wrong "null" makes the client or server shut down, it is disruptive to the service.

So what can you do? You can log all input events in a ring buffer and send a big dump to the developers when the game client crash and try to find correlation between stack trace and input events (if you get multiple reports). And on a server with say 1000 concurrent users that interact and trigger a variety of bugs in one big multi-threaded global state world sim… You want to use a safe language for world content, where you disable features if needed, but keep the world running.

The same goes for a web service. You don't want to shut down the whole service just because the "about page" handler fails to test for null.

Big systems have to live with bugs, it is inevitable that they run with bugs. Erlang was created for stuff like that. Can you imagine a telephone central going down just because the settings for one subscriber was wrong and triggered a bug?
Jan 17 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 4:44 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 It isn't special in the theoretical sense, but since it in practice is used
 for a wide variety of things it becomes a sensitive issue in situations where
 you cannot replay input. Especially since it is used for linking together
 subsystems. In multi user games you can do fine with wrong "float values",
 objects might drift, you might get odd effects, but those bugs can be cool.
 They might even become features. A wrong "null" makes the client or server
 shut down, it is disruptive to the service.

 So what can you do? You can log all input events in a ring buffer and send a
 big dump to the developers when the game client crash and try to find
 correlation between stack trace and input events (if you get multiple
 reports). And on a server with say 1000 concurrent users that interact and
 trigger a variety of bugs in one big multi-threaded global state world sim…
 You want to use a safe language for world content, where you disable
 features if needed, but keep the world running.

 The same goes for a web service. You don't want to shut down the whole
 service just because the "about page" handler fails to test for null.

 Big systems have to live with bugs, it is inevitable that they run with
 bugs. Erlang was created for stuff like that. Can you imagine a telephone
 central going down just because the settings for one subscriber was wrong
 and triggered a bug?
First off, in all these scenarios you describe, how does not having a null make it EASIER to track down the bug?

Next, it's one thing to have a game degrade but keep playing if there's a bug in the code. It's quite another to have a critical system start behaving erratically because of a bug. Remember, a bug means the program has entered an undefined, unanticipated state. If it is a critical system, continuing to run that system is absolutely, positively, the wrong thing to do. The correct way is when the bug is detected, that software is IMMEDIATELY SHUT DOWN and the backup is engaged. If you don't have such a backup, you have a very, very badly designed system.

I'm not just making that stuff up; it is not my uninformed opinion. This is how, for example, Boeing designs things for airliners. You wouldn't want it any other way. Fukushima and Deepwater Horizon are what happens when you don't have a system with independent backups.

Again, granted, this doesn't apply to a game. Nothing bad happens when a game goes awry. But it definitely applies when there's money involved, like transaction processing, or trading software. And it certainly applies when failure means people die.
Jan 17 2014
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 01:22:49 +0000, Walter Bright <newshound2 digitalmars.com> said:

 First off, in all these scenarios you describe, how does not having a 
 null make it EASIER to track down the bug?
Implemented well, it makes it a compilation error. It works like this:

- can't pass a likely-null value to a function that wants a not-null argument.
- can't assign a likely-null value to a not-null variable.
- can't dereference a likely-null value.

You have to check for null first, and the check changes the value from likely-null to not-null in the branch taken when the pointer is valid.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
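These rules can be approximated in library code; a minimal sketch (this NotNull is a hypothetical wrapper, not an existing Phobos type), where construction is the single checkpoint and dereferencing afterwards needs no check:

```d
import std.exception : enforce;

struct NotNull(T) if (is(T == class))
{
    private T payload;

    @disable this();                 // no default (null) construction

    this(T p)
    {
        enforce(p !is null, "null assigned to NotNull");
        payload = p;
    }

    alias payload this;              // forward member access
}

class Widget { int x = 42; }

void use(NotNull!Widget w)           // callee can rely on w being non-null
{
    assert(w.x == 42);               // no null check needed here
}

void main()
{
    auto w = NotNull!Widget(new Widget);
    use(w);
    // use(NotNull!Widget(null));    // would throw at the single checkpoint
}
```

Unlike a language-level solution, the check here happens at runtime; the compile-time part is only that a NotNull can never be default-constructed.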
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:18 PM, Michel Fortin wrote:
 Implemented well, it makes it a compilation error. It works like this:

 - can't pass a likely-null value to a function that wants a not-null argument.
 - can't assign a likely-null value to a not-null variable.
 - can't dereference a likely-null value.

 You have to check for null first, and the check changes the value from
 likely-null to not-null in the branch taken when the pointer is valid.
I was talking about runtime errors, in that finding the cause of a runtime null failure is not harder than finding the cause of any other runtime invalid value. We all agree that detecting bugs at compile time is better.
Jan 17 2014
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 02:41:58 +0000, Walter Bright <newshound2 digitalmars.com> said:

 We all agree that detecting bugs at compile time is better.
I guess so. But you're still unconvinced it's worth it to eliminate null dereferences? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 7:23 PM, Michel Fortin wrote:
 I guess so. But you're still unconvinced it's worth it to eliminate null
 dereferences?
I think it's a more nuanced problem than that. But I agree that compile time detect of null reference bugs is better than runtime detection of them.
Jan 17 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 7:38 PM, Walter Bright wrote:
 But I agree that compile time
 detect of null reference bugs is better than runtime detection of them.
BTW, the following program:

   class C { int a,b; }

   int test() {
     C c;
     return c.b;
   }

When compiled with -O:

   foo.d(6): Error: null dereference in function _D3foo4testFZi

It isn't much, only working on intra-function analysis and only when the optimizer is used, but it's something. It's been in dmd for a long time.
Jan 17 2014
next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Saturday, 18 January 2014 at 06:10:20 UTC, Walter Bright wrote:
 On 1/17/2014 7:38 PM, Walter Bright wrote:
 But I agree that compile time
 detect of null reference bugs is better than runtime detection 
 of them.
 BTW, the following program:

    class C { int a,b; }

    int test() {
      C c;
      return c.b;
    }

 When compiled with -O:

    foo.d(6): Error: null dereference in function _D3foo4testFZi

 It isn't much, only working on intra-function analysis and only when the
 optimizer is used, but it's something. It's been in dmd for a long time.
But:

----
class C { int a, b; }

C create() pure nothrow {
    return null;
}

int test() pure nothrow {
    C c = create();
    return c.b;
}

void main() {
    test();
}
----

Print nothing, even with -O. Maybe the idea to use C? for nullable references would be worth implementing.
Jan 18 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 2:02 AM, Namespace wrote:
 Print nothing, even with -O.
Right. As I explained, it is only intra-function analysis.
Jan 18 2014
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-01-18 07:10, Walter Bright wrote:

 BTW, the following program:

    class C { int a,b; }

    int test() {
      C c;
      return c.b;
    }

 When compiled with -O:

    foo.d(6): Error: null dereference in function _D3foo4testFZi

 It isn't much, only working on intra-function analysis and only when the
 optimizer is used, but it's something. It's been in dmd for a long time.
Why only when the optimizer is used? -- /Jacob Carlborg
Jan 18 2014
next sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Jacob Carlborg"  wrote in message news:lbdkvv$2f1n$1 digitalmars.com... 
 Why only when the optimizer is used?
The code that detects the null dereference is in the optimizer.
Jan 18 2014
parent reply Jacob Carlborg <doob me.com> writes:
On 2014-01-18 11:41, Daniel Murphy wrote:

 The code that detects the null dereference is in the optimizer.
Why is it located there? -- /Jacob Carlborg
Jan 18 2014
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Saturday, 18 January 2014 at 11:03:22 UTC, Jacob Carlborg 
wrote:
 On 2014-01-18 11:41, Daniel Murphy wrote:

 The code that detects the null dereference is in the optimizer.
Why is it located there?
It's probably flow sensitive and only the optimizer builds a CFG.
Jan 18 2014
parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Tobias Pankrath"  wrote in message 
news:rxnlyzyfkzndzfslvqzp forum.dlang.org...
 Why is it located there?
It's probably flow sensitive and only the optimizer builds a CFG.
Exactly. It _could_ be done in the frontend, but it hasn't.
Jan 18 2014
parent Jacob Carlborg <doob me.com> writes:
On 2014-01-18 12:21, Daniel Murphy wrote:

 Exactly.  It _could_ be done in the frontend, but it hasn't.
I see, thanks. -- /Jacob Carlborg
Jan 18 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 2:28 AM, Jacob Carlborg wrote:
 Why only when the optimizer is used?
Yes, because the optimizer uses data flow analysis, and this falls out of that.
Jan 18 2014
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 03:38:31 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 1/17/2014 7:23 PM, Michel Fortin wrote:
 I guess so. But you're still unconvinced it's worth it to eliminate null
 dereferences?
I think it's a more nuanced problem than that. But I agree that compile time detect of null reference bugs is better than runtime detection of them.
In the C++ project I'm working on (full of asynchronous callbacks) I've built myself a not-null smart pointer type. It tries to disallow things at compile time, but because it can't follow control flow there are obviously some possible leaks that are caught at runtime through an assertion upon assignment. I created it because I had some untraceable bugs and I thought it'd be simpler to convert the code base to use not-null pointers than it'd be to narrow down the issues manually.

The transition was surprisingly easy: just try to make everything you can not-nullable. Sometimes the logic forces you to make a variable nullable; sometimes you discover a bug in the form of a missing null check.

Implementing non-nullable pointers and adapting the code base was worth it for me. I found some bugs that I couldn't track otherwise, and some others that had yet to occur. But more importantly, now when I look at a pointer in this project I know immediately from its declaration whether that pointer can be null or not, and I find that very helpful when tweaking code that was written a while ago.

But in the process I've created a new meta-language (make_new<T> instead of new, new smart pointer types, cast functions for those pointers, etc.) that someone new on the project will have to learn and train himself to use correctly. Integrating the thing in the language would make things that much simpler, because it'd preserve the normal syntax of the language and also because null leaks can all be detected at compile time, with clearer error messages.

This is what motivated my proposal earlier in this thread. I'm proposing an improved version of what I'm already using currently:

http://forum.dlang.org/thread/lba1qe$1sih$1 digitalmars.com?page=4#post-lbbgr6:247sf:241:40digitalmars.com

I'd like to know what you think of it.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Jan 18 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 7:38 PM, Walter Bright wrote:
 On 1/17/2014 7:23 PM, Michel Fortin wrote:
 I guess so. But you're still unconvinced it's worth it to eliminate null
 dereferences?
I think it's a more nuanced problem than that.
I should clarify that. I oppose solutions that replace the null seg fault with another bug that is harder to detect. (I know this is not what you have proposed.)
Jan 18 2014
prev sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
I somehow missed this post, so a delayed response.

On Saturday, 18 January 2014 at 01:22:48 UTC, Walter Bright wrote:
 First off, in all these scenarios you describe, how does not 
 having a null make it EASIER to track down the bug?
I have not argued for not having a null. I have argued for trapping null, instantiating a type-specific default, and recovering if the type's meta says so. That default could be to issue a NullException. At that point you should be able to log null dereferences too.

I have previously argued that non-null pointers are nice, but only if you have whole-program analysis. Otherwise you'll end up injecting null-tests everywhere.
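A library-level approximation of that trapping idea, without the hardware trap (orDefault is a hypothetical helper; it substitutes a lazily created per-type default object before the dereference):

```d
// Hypothetical helper: resolve a possibly-null class reference to a
// shared per-type "null object" (requires a default constructor).
T orDefault(T)(T obj) if (is(T == class))
{
    static T fallback;           // one default instance per type (per thread)
    if (fallback is null)
        fallback = new T;
    return obj is null ? fallback : obj;
}

class User
{
    bool loggedIn = false;
    bool isLoggedIn() { return loggedIn; }
}

void main()
{
    User u;                              // null by default
    // Instead of segfaulting, dereference the default object:
    assert(!u.orDefault.isLoggedIn());
}
```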
 there's a bug in the code. It's quite another to have a 
 critical system start behaving erratically because of a bug.
Yes, and this is a problematic definition since "erratic" is kind of subjective given that most systems don't follow the model 100%.
 do. The correct way is when the bug is detected, that software 
 is IMMEDIATELY SHUT DOWN and the backup is engaged. If you 
 don't have such a backup, you have a very, very badly designed 
 system.
In that case 99.99% of all software is very, very badly designed. DMD inclusive. Human beings too, but we happen to be fault tolerant, not by using backups, but by being good at recovery and dealing in a fuzzy way with incomplete or wrong information.
Jan 18 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 3:34 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 do. The correct way is when the bug is detected, that software is IMMEDIATELY
 SHUT DOWN and the backup is engaged. If you don't have such a backup, you have
 a very, very badly designed system.
In that case 99.99% of all software is very, very badly designed. DMD inclusive.
You elided the qualification "If it is a critical system". dmd is not a safety critical application.
Jan 18 2014
next sibling parent reply "Kapps" <opantm2+spam gmail.com> writes:
On Sunday, 19 January 2014 at 02:33:15 UTC, Walter Bright wrote:
 You elided the qualification "If it is a critical system". dmd 
 is not a safety critical application.
Nor are 99.99%, possibly 100%, of the applications currently being built with D.

The points are valid, no safety critical application should ever rely on any individual components not failing. But for the purpose of D, they are not particularly applicable. People are just arguing over two different things (safety critical code vs standard D code), with neither disagreeing with the others, simply bringing up different situations.
Jan 18 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 11:11 PM, Kapps wrote:
 On Sunday, 19 January 2014 at 02:33:15 UTC, Walter Bright wrote:
 You elided the qualification "If it is a critical system". dmd is not a safety
 critical application.
Nor are 99.99%, possibly 100%, of the applications currently being built with D.
Sociomantic uses D to write trading software. I think they'd be ill advised to write code in such a way that it continues making trades after entering an invalid state, as it could be very expensive.

Furthermore, part of the reason why I am adamant about this is that far too often I run into, including in this thread, programmers who believe that the way to write critical software is to keep the program running even if it has failed.
 The points are valid, no safety critical application should ever rely on any
 individual components not failing. But for the purpose of D, they are not
 particularly applicable.
I don't buy that this is not applicable for D. D must not promote unsafe (as in potentially life threatening) programming practice as a "best practice". You never know how somebody is going to use a programming language.
 People are just arguing over two different things
 (safety critical code vs standard D code), with neither disagreeing with the
 others, simply bringing up different situations.
There has been a lot of misunderstandings and miscommunications in this thread. I do my best to clear them up.
Jan 18 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 6:33 PM, Walter Bright wrote:
 You elided the qualification "If it is a critical system". dmd is not a safety
 critical application.
And I still practice what I preach with DMD. DMD never attempts to continue running after it detects that it has entered an invalid state - it ceases immediately. Furthermore, when it detects any error in the source code being compiled, it does not generate an object file.
Jan 18 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Sunday, 19 January 2014 at 07:40:09 UTC, Walter Bright wrote:
 On 1/18/2014 6:33 PM, Walter Bright wrote:
 You elided the qualification "If it is a critical system". dmd 
 is not a safety critical application.
And I still practice what I preach with DMD. DMD never attempts to continue running after it detects that it has entered an invalid state - it ceases immediately. Furthermore, when it detects any error in the source code being compiled, it does not generate an object file.
I think the whole "critical system" definition is rather vague.

For safety critical applications you want proven implementation technology, proper tooling and a methodology to go with it. And it is very domain specific. Simple algorithms can be proven correct, some types of signal processing can be proven correct/stable, some types of implementations (like a FPGA) affords exhaustive testing (test all combination of input).

In the case of D, I find that a somewhat theoretical argument. D is not a proven technology. D does not have tooling with a methodology to go with it.

But yes, you want backups due to hardware failure even for programs that are proven correct. In a telephone central you might want to have a backup system to handle emergency calls.

If you take a theoretical position (which I think you do) then I also think you should accept a theoretical argument. And the argument is that there is no theoretical difference between allowing programs with known bugs to run and allowing programs with anticipated bugs to run (e.g. catching "bottom" in a subsystem). There is also no theoretical difference between allowing DMD to generate code that is not following the spec 100%, and allowing DMD to generate code if an anticipated "bottom" occurs. It all depends on what degree of deviance from the specified model you accept.

It is quite acceptable to catch "bottom" for an optimizer and generate less optimized code for that function, or to turn off that optimizer setting. However, in a compiler you can defer to "the pilot" (compiler) so that is generally easier. In a server you can't.
Jan 19 2014
next sibling parent reply "Paolo Invernizzi" <paolo.invernizzi gmail.com> writes:
On Sunday, 19 January 2014 at 12:20:42 UTC, Ola Fosheim Grøstad 
wrote:
 On Sunday, 19 January 2014 at 07:40:09 UTC, Walter Bright wrote:
 On 1/18/2014 6:33 PM, Walter Bright wrote:
 You elided the qualification "If it is a critical system". 
 dmd is not a safety critical application.
And I still practice what I preach with DMD. DMD never attempts to continue running after it detects that it has entered an invalid state - it ceases immediately. Furthermore, when it detects any error in the source code being compiled, it does not generate an object file.
 I think the whole "critical system" definition is rather vague.

 For safety critical applications you want proven implementation technology,
 proper tooling and a methodology to go with it. And it is very domain
 specific. Simple algorithms can be proven correct, some types of signal
 processing can be proven correct/stable, some types of implementations
 (like a FPGA) affords exhaustive testing (test all combination of input).

 In the case of D, I find that a somewhat theoretical argument. D is not a
 proven technology. D does not have tooling with a methodology to go with it.

 But yes, you want backups due to hardware failure even for programs that
 are proven correct. In a telephone central you might want to have a backup
 system to handle emergency calls.

 If you take a theoretical position (which I think you do) then I also think
 you should accept a theoretical argument. And the argument is that there is
 no theoretical difference between allowing programs with known bugs to run
 and allowing programs with anticipated bugs to run (e.g. catching "bottom"
 in a subsystem). There is also no theoretical difference between allowing
 DMD to generate code that is not following the spec 100%, and allowing DMD
 to generate code if an anticipated "bottom" occurs. It all depends on what
 degree of deviance from the specified model you accept.

 It is quite acceptable to catch "bottom" for an optimizer and generate less
 optimized code for that function, or to turn off that optimizer setting.
 However, in a compiler you can defer to "the pilot" (compiler) so that is
 generally easier. In a server you can't.
I'm trying to understand your motivations, but why in a server you can't? I still can't grasp that point. -- Paolo
Jan 19 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Sunday, 19 January 2014 at 14:23:44 UTC, Paolo Invernizzi 
wrote:
 However, in a compiler you can defer to "the pilot" (compiler) 
 so that is generally easier. In a server you can't.
I'm trying to understand your motivations, but why in a server you can't? I still can't grasp that point.
It was a typo:

 However, in a compiler you can defer to "the pilot" (programmer) so that is
 generally easier. In a server you can't.
Jan 19 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/19/2014 4:20 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 If you take a theoretical position (which I think you do)
Please set such statements against my assertion that Boeing follows the principles I outlined in practice, and very successfully.
Jan 19 2014
parent "Ola Fosheim Grøstad" writes:
On Sunday, 19 January 2014 at 19:55:34 UTC, Walter Bright wrote:
 On 1/19/2014 4:20 AM, "Ola Fosheim Grøstad" 
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 If you take a theoretical position (which I think you do)
Please set such statements against my assertion that Boeing follows the principles I outlined in practice, and very successfully.
Would they even consider using D? Are you referring to a niche for which D is not suited? In most situations you cannot afford to develop a program to the point where it is 100% bug free, let alone develop two independent implementations of the spec that are 100% bug free. This is a very narrow niche. And it isn't even relevant, because crash-and-burn-on-null is still an option with voluntary recovery mechanisms.

And as I pointed out, you wouldn't use zero as a null value if safety and debugging were your prime concern. You would use a more "secure" bit pattern (i.e. one harder to arrive at by accident), and you would differentiate "null" from "undefined".
Jan 19 2014
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-01-19 00:34, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:

 I have not argued for not having a null. I have argued for trapping
 null, instantiating a type specific default and recover if the type's
 meta says so. That default could be to issue a NullException. At that
 point you should be able to log null dereferences too.
I think nil (null) works quite nicely in Ruby. nil is a singleton instance of the NilClass class. Since it's an object you can call methods on it, like to_s, which returns an empty string. It works quite well when doing web development with Ruby on Rails. If you're trying to render something that's nil you'll get nothing, instead of crashing that whole page. Sure, there might be a small icon or similar that isn't rendered but that's usually a minor detail. If the page is working and main content is rendered that's preferable. -- /Jacob Carlborg
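The nil-as-object behaviour described above can be sketched outside Ruby as well. Here is a minimal Python rendition of the same idea, a singleton null object with harmless conversion methods; the to_s/to_a names mirror Ruby's, and the render helper is purely illustrative:

```python
# Illustrative sketch of Ruby's NilClass idea: a singleton "nil" that
# responds to a few conversion methods instead of blowing up.
# (This is a Python sketch, not Ruby's actual implementation.)

class NilClass:
    _instance = None

    def __new__(cls):
        # Singleton: every "nil" is the same object, as in Ruby.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def to_s(self):
        return ""          # rendering nil yields an empty string

    def to_a(self):
        return []          # and an empty list

    def __bool__(self):
        return False       # nil is falsy, so "if value:" checks still work

nil = NilClass()

def render(value):
    # A template engine can call to_s blindly; nil renders as nothing
    # instead of crashing the whole page.
    return value.to_s() if hasattr(value, "to_s") else str(value)
```

This is what makes the Rails behaviour possible: the page keeps rendering, and a nil value simply contributes an empty string.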
Jan 19 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jacob Carlborg:

 I think nil (null) works quite nicely in Ruby. nil is a 
 singleton instance of the NilClass class. Since it's an object 
 you can call methods on it, like to_s, which returns an empty 
 string. It works quite well when doing web development with 
 Ruby on Rails. If you're trying to render something that's nil 
 you'll get nothing, instead of crashing that whole page. Sure, 
 there might be a small icon or similar that isn't rendered but 
 that's usually a minor detail. If the page is working and main 
 content is rendered that's preferable.
Walter is arguing against this solution in D. While that can be OK for Ruby used for web development, for a statically typed language meant for safe coding styles there are better type-based solutions to that problem. Bye, bearophile
Jan 19 2014
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-19 11:30:00 +0000, "bearophile" <bearophileHUGS lycos.com> said:

 Jacob Carlborg:
 
 I think nil (null) works quite nicely in Ruby. nil is a singleton 
 instance of the NilClass class. Since it's an object you can call 
 methods on it, like to_s, which returns an empty string. It works quite 
 well when doing web development with Ruby on Rails. If you're trying to 
 render something that's nil you'll get nothing, instead of crashing 
 that whole page. Sure, there might be a small icon or similar that 
 isn't rendered but that's usually a minor detail. If the page is 
 working and main content is rendered that's preferable.
Walter is arguing against this solution in D. While that can be OK for Ruby used for web development, for a statically typed language meant for safe coding styles there are better type-based solutions to that problem.
It won't work in D's type system anyway. Ruby is dynamically typed, that's why it can work. Interestingly, in Objective-C calling a method on a null object pointer just does nothing. That's a feature that is often useful when chaining calls that might return null (you don't have to do all these extra checks) or with weak pointers, but it can on occasion lead to subtle bugs. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
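The Objective-C behaviour can be mimicked in a few lines. A hedged sketch in Python (not Obj-C's actual machinery) of "messages to nil do nothing", including the chaining that also makes the subtle bugs possible:

```python
# Sketch of Objective-C's "messages to nil do nothing" semantics:
# any attribute access on Nil yields a no-op callable returning Nil
# again, so chained calls fall through silently.
# (Illustrative Python only; Obj-C implements this in its runtime.)

class _Nil:
    def __getattr__(self, name):
        # Every "method" on nil is a no-op returning nil, so chains
        # like Nil.parent().window().title() need no null checks.
        return lambda *args, **kwargs: self

    def __bool__(self):
        return False       # nil still tests false when you do check

Nil = _Nil()
```

The convenience and the danger are the same thing: `Nil.parent().window().title()` runs without complaint, which is exactly how an error can propagate silently instead of failing near its source.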
Jan 19 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 19 January 2014 at 10:32:36 UTC, Jacob Carlborg wrote:
 I think nil (null) works quite nicely in Ruby. nil is a 
 singleton instance of the NilClass class. Since it's an object 
 you can call methods on it, like to_s, which returns an empty 
 string. It works quite well when doing web development with
I have no experience with Ruby, but Javascript also does this (undefined is an object). I don't think it is more work to debug Python and Javascript null exceptions than C-like code.

I think it was a mistake to let zero represent null, and it was probably done to make testing faster. But a different bit pattern could prevent conflation between memory corruption and null. It is highly improbable that an address like $F1234324 is the result of memory corruption.

Another advantage of null objects as a pattern, and mechanisms to support them, is that you can have multiple variants and differentiate between:

- undefined (not initialized)
- null (deliberately not having an attribute)
- application-specific null values (like try-again or lazy-evaluation) that, depending on context, evaluate to undefined, evaluate to null, or try to fetch the value by computation.
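The multiple-variants idea can be sketched with distinct sentinel objects; the names and the resolve helper below are illustrative, not from any library:

```python
# Sketch: distinguish "undefined", "null" and application-specific
# null values with distinct sentinel objects, so each flavour of
# "nothing" can be handled differently. (All names are illustrative.)

class Sentinel:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

UNDEFINED = Sentinel("UNDEFINED")   # never initialized: a bug
NULL      = Sentinel("NULL")        # deliberately absent: fine
TRY_AGAIN = Sentinel("TRY_AGAIN")   # lazy: fetch on demand

def resolve(value, fetch):
    # Context decides what each flavour of "nothing" means.
    if value is UNDEFINED:
        raise ValueError("attribute was never initialized")
    if value is NULL:
        return None
    if value is TRY_AGAIN:
        return fetch()              # lazy evaluation by computation
    return value
```

Identity checks (`is`) make the three cases impossible to conflate with each other or with ordinary values, which is the point of the pattern.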
Jan 19 2014
parent reply "Paolo Invernizzi" <paolo.invernizzi gmail.com> writes:
On Sunday, 19 January 2014 at 12:29:14 UTC, Ola Fosheim Grøstad 
wrote:
 On Sunday, 19 January 2014 at 10:32:36 UTC, Jacob Carlborg 
 wrote:

 I have no experience with Ruby, but Javascript also do this 
 (undefined is an object). I don't think it is more work to 
 debug Python and Javascript null exceptions than C-like code.
Having had heavy experience with Python, and having used D since D1, I would say the contrary: it's easier to handle null exceptions in D. That's only experience based, naturally; I don't pretend this to be science.

Going down the hill, we should then love PHP, as its mantra seems to be "go marching in", and we all know what a horrible mess it is.

--- Paolo
Jan 19 2014
parent "Ola Fosheim Grøstad" writes:
On Sunday, 19 January 2014 at 14:14:11 UTC, Paolo Invernizzi 
wrote:
 Having had heavy experience with Python, and having used D 
 since D1, I would say the contrary: it's easier to handle 
 null exceptions in D.
Leaving dynamic vs non-dynamic aspects aside, why do you find null dereferencing easier to handle (i.e. debug) in D than in Python?
Jan 19 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 4:44 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 Big systems have to live with bugs, it is inevitable that they run with bugs.
It's a dark and stormy night. You're in a 747 on final approach, flying on autopilot.

Scenario 1
----------
The autopilot software was designed by someone who thought it should keep operating even if it detects faults in the software. The software runs into a null object when there shouldn't be one, and starts feeding bad values to the controls. The plane flips over and crashes, everybody dies. But hey, the software kept on truckin'!

Scenario 2
----------
The autopilot software was designed by Boeing. Actually, there are two autopilots, each independently developed, with different CPUs, different hardware, different algorithms, different languages, etc. One has a null pointer fault. A deadman circuit sees this, and automatically shuts that autopilot down. The other autopilot immediately takes over. The pilot is informed that one of the autopilots failed, and the pilot immediately shuts off the remaining autopilot and lands manually. The passengers all get to go home.

Note that in both scenarios there are bugs in the software. Yes, there have been incidents with earlier autopilots where bugs caused the airplane to go inverted.

Consider also the Toyota. My understanding from reading reports (admittedly journalists botch up the facts) is that a single computer controls the brakes, engine, throttle, ignition switch, etc. Oh joy. I wouldn't want to be in that car when it keeps on going despite having self-detected faults. It could, you know, start going at full throttle and ignore all signals to brake or turn off, only stopping when it crashes or runs out of gas.
Jan 17 2014
next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 01:46:55 UTC, Walter Bright wrote:
 The autopilot software was designed by someone who thought it 
 should keep operating even if it detects faults in the software.
I would not write autopilot or life-support software in D. So that is kind of out-of-scope for the language. But: Keep the system simple, select a high level language and verify correctness by an automated proof system. Use 3 independently implemented systems and shut down the one that produces deviant values. That covers more ground than the unlikely null-pointers in critical systems. No need to self-detect anything.
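The shut-down-the-deviant scheme above can be sketched as a simple majority voter (illustrative Python; real triple-modular-redundancy systems typically vote in hardware, and the tolerance parameter is an assumption for analog-style outputs):

```python
# Sketch of 3-way redundancy: run three independent implementations,
# compare their outputs, and vote out the channel that deviates.
# (Illustrative only; names and tolerance handling are made up.)

def vote(a, b, c, tolerance=0.0):
    """Return (majority_value, index_of_deviant_channel_or_None)."""
    def agree(x, y):
        return abs(x - y) <= tolerance

    if agree(a, b):
        # a and b agree; c is the deviant unless it agrees too.
        return a, (None if agree(a, c) else 2)
    if agree(a, c):
        return a, 1
    if agree(b, c):
        return b, 0
    # No two channels agree: nothing left to trust.
    raise RuntimeError("no two channels agree: total system failure")
```

The deviant channel gets shut down (and perhaps rebooted and resynced) while the agreeing pair keeps producing output; no self-detection inside a channel is needed.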
 Consider also the Toyota. My understanding from reading reports 
 (admittedly journalists botch up the facts) is that a single 
 computer controls the brakes, engine, throttle, ignition 
 switch, etc. Oh joy. I wouldn't want to be in that car when it 
 keeps on going despite having self-detected faults.
So you would rather have the car drive off the road because the anti-skid software abruptly turned itself off during an emergency manoeuvre?

But would you stay in a car where the driver talks on a cell phone while driving, or would you tell him to stop? Probably much more dangerous if you measured the correlation between accidents and system features. So you demand perfection from a computer, but not from a human being that is exhibiting risky behaviour. That's an emotional assessment.

The rational action would be to improve the overall safety of the system, rather than optimizing a single part. So spend the money on installing a cell-phone jammer and an accelerator limiter rather than investing in more computers. Clearly, the computer is not the weakest link; the driver is. He might not agree, but he is, and he should be forced to exhibit low-risk behaviour. Direct effort to where it has the most effect.

(From a system-analytical point of view. It might not be a good sales tactic, because car buyers aren't that rational.)
Jan 17 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Ola Fosheim Grøstad:

 I would not write autopilot or life-support software in D. So 
 that is kind of out-of-scope for the language.
Do you want to use Ada for those purposes? Bye, bearophile
Jan 17 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:22 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Saturday, 18 January 2014 at 01:46:55 UTC, Walter Bright wrote:
 The autopilot software was designed by someone who thought it should keep
 operating even if it detects faults in the software.
I would not write autopilot or life-support software in D. So that is kind of out-of-scope for the language. But: Keep the system simple, select a high level language and verify correctness by an automated proof system. Use 3 independently implemented systems and shut down the one that produces deviant values. That covers more ground than the unlikely null-pointers in critical systems. No need to self-detect anything.
I didn't mention that the dual autopilots also have a comparator on the output, and if they disagree they are both shut down. The deadman is an additional check. The dual system has proven itself, a third is not needed.
 Consider also the Toyota. My understanding from reading reports (admittedly
 journalists botch up the facts) is that a single computer controls the brakes,
 engine, throttle, ignition switch, etc. Oh joy. I wouldn't want to be in that
 car when it keeps on going despite having self-detected faults.
So you would rather have the car drive off the road because the anti-skid software abruptly turned itself off during an emergency manoeuvre?
Please reread what I wrote. I said it shuts itself off and engages the backup, and if there is no backup, you have failed at designing a safe system.
 But would you stay in a car where the driver talks in a cell-phone while
 driving, or would you tell him to stop? Probably much more dangerous if you
 measured correlation between accidents and system features. So you demand
 perfection from a computer, but not from a human being that is exhibiting
 risk-like behaviour. That's an emotional assessment.

 The rational action would be to improve the overall safety of the system,
rather
 than optimizing a single part. So spend the money on installing a cell-phone
 jammer and an accelerator limiter rather than investing in more computers.
 Clearly, the computer is not the weakest link, the driver is. He might not
 agree, but he is and he should be forced to exhibit low risk behaviour. Direct
 effort to where it has most effect.

 (From a system analytical point of view. It might not be a good sales tactic,
 because car buyers aren't that rational.)
I have experience with this stuff, Ola, from my years at Boeing designing flight critical systems. What I outlined is neither irrational nor emotionally driven, and has the safety record to prove its effectiveness. I also ask that you please reread what I wrote - I explicitly do not demand perfection from a computer.
Jan 17 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 02:48:38 UTC, Walter Bright wrote:
 I didn't mention that the dual autopilots also have a 
 comparator on the output, and if they disagree they are both 
 shut down. The deadman is an additional check. The dual system 
 has proven itself, a third is not needed.
The pilot is engaged as the third. There are situations where you cannot have a third "intelligent" agent take over, so you should have 3 systems, and reboot and resync the one that diverges, but this is rather off topic. I don't think D is a language that should be used for these kind of systems.
 Please reread what I wrote. I said it shuts itself off and 
 engages the backup, and if there is no backup, you have failed 
 at designing a safe system.
A car driver that is doing an emergency manoeuvre is not part of a safe system, indeed! If you want one system to take over for another, you need a safe spot to do it in. Just disappearing instantly isn't optimal, because instantly changing responsiveness is a guarantee of failure. In fact, being instantly disruptive is usually the wrong thing to do. You should spin down gracefully. I don't see why you cannot do that with null pointers. You obviously can do it with division-by-zero errors. I think you associate null pointers with memory corruption, which truly is an invalid state for which you might want to instantly shut down.
 I have experience with this stuff, Ola, from my years at Boeing 
 designing flight critical systems. What I outlined is neither 
 irrational nor emotionally driven, and has the safety record to 
 prove its effectiveness.
In a very narrow field where the pilot is monitoring the system and can take over. The pilot is the ultimate source for failure (in a political sense). So you basically shut down the technology and blame the pilot if you end up with a crash. That only works if the computer has been made to replace a human being.
Jan 17 2014
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Jan 18, 2014 at 02:22:22AM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Saturday, 18 January 2014 at 01:46:55 UTC, Walter Bright wrote:
[...]
Consider also the Toyota. My understanding from reading reports
(admittedly journalists botch up the facts) is that a single
computer controls the brakes, engine, throttle, ignition switch,
etc. Oh joy. I wouldn't want to be in that car when it keeps on
going despite having self-detected faults.
So you would rather have the car drive off the road because the anti-skid software abruptly turned itself off during an emergency manoeuvre?
You missed his point. The complaint is that the car has a *single* software system that handles everything. That's a single point of failure. When that single software system fails, *everything* fails.

A fault-tolerant design demands at least two anti-skid software units, where the redundant unit will kick in when the primary one turns off or stops for whatever reason. So when a software fault occurs in the primary unit, it gets shut off, and the backup unit takes over and keeps the car stable. You'd only crash in the event that *both* units fail at the same time, which is far less likely than a single unit failing.

This is better than having a single software system that tries to fix itself when it goes wrong, because the fact that something caused part of the code to crash (segfault, or whatever) is a sign that the system is no longer in a state anticipated by the engineers, so there's no guarantee it won't make things worse when it tries to fix itself. For example, it might be scrambled into a state where it keeps the accelerator on with no way to override it, thereby making the problem worse.

You need a *decoupled* redundant system to be truly confident that whatever fault caused the problem in the first system doesn't also affect the backup / self-repair system, something which doesn't hold for a single software unit (for example, if the power supply to the unit fails, then whatever self-repair subsystem it has will also be non-functional). That way, when the first unit goes wrong, it can simply be shut off safely, thereby preventing making the problem worse, and the backup unit takes over and keeps things going.
To use a software example: if you have a single process that tries to fix itself when, say, a null pointer is dereferenced, then there's no guarantee that the error recovery code won't do something stupid, like format your disk (because the null pointer in an unexpected place proves that the code has logic problems: it isn't in a state that the engineers planned for, so who knows what else is wrong with it -- maybe a function pointer to display graphics has been accidentally replaced with a pointer to the formatDisk function due to the bug that caused the null to appear in an unexpected place).

If instead you have two redundant processes, one of which is doing the real work and the second is just sleeping, then when the first process segfaults due to a null pointer, the second one can kick into action -- since it hasn't been doing the same thing as the first process, it's likely still in a safe, consistent state, and so it can safely take over and keep the service running.

This is the premise of high-availability systems: there's a primary server that's doing the real work, and one or more redundant units. When the primary dies (power loss, CPU overheat, segfault causing it no longer to respond, etc.), a watchdog timer triggers a failover to the second unit, thus minimizing service interruption time. The failover detection code can then contact an administrator (email, SMS, etc.) notifying that something went wrong with the first unit, and service continues uninterrupted while the first unit is repaired.

OTOH, if you have only a single unit and something goes wrong, there's a risk that the recovery code will go wrong too, so the entire unit stops functioning, and service is interrupted until it's repaired.

T

--
Study gravitation, it's a field with a lot of potential.
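The primary/standby idea can be sketched in a few lines (illustrative Python; real HA setups use separate machines and watchdog timers rather than an in-process try/except, and all names here are made up):

```python
# Sketch of failover: a primary worker and a decoupled standby.
# When the primary dies (here, raises), the supervisor notifies the
# administrator and hands the request to the standby, instead of
# letting the faulty primary try to "fix itself".

def make_supervisor(primary, standby, notify):
    def handle(request):
        try:
            return primary(request)
        except Exception as e:
            notify(f"primary failed: {e}")   # page the administrator
            return standby(request)          # service continues
    return handle

# Usage: the primary hits a null-pointer-style fault; the standby,
# which was not executing the same buggy path, serves the request.
faults = []

def primary(req):
    raise RuntimeError("null pointer dereference")

def standby(req):
    return f"handled {req}"

handle = make_supervisor(primary, standby, faults.append)
```

The key property is the decoupling: the standby was not in the state that triggered the fault, so it can take over safely while the primary is diagnosed.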
Jan 17 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 7:05 PM, H. S. Teoh wrote:
 [...]
Thank you, a good explanation. I don't know how the anti-skid brake system is designed. But on the older systems, brakes are a dual mostly-independent system. There was one system for the front brakes, and another for the rear. Dual cylinders, dual reservoirs, etc. The brake pedal operated both cylinders. There was even a hydraulic comparator between the two, which would turn on a red [brake] light on the dash if they differed in pressure. Last year, I acquired a leak in my rear brakes on my old truck, and that light coming on was my first indication of trouble. The front brakes still worked fine, and topping off the rear reservoir got the rear brakes temporarily working, and I was able to ease it to the repair shop without difficulty. It's a good example of how to build a safe, fault tolerant system.
Jan 17 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 03:07:30 UTC, H. S. Teoh wrote:
 You missed his point. The complaint is that the car has a 
 *single*
 software system that handles everything. That's a single point 
 of
 failure. When that single software system fails, *everything* 
 fails.
I didn't miss the point at all. My point is that you should always target the cost of improving the statistical overall safety of the system rather than optimizing the stability of a single part that almost never fail. Having multiple independent software implementations only works for very simple systems. And in that case you can prove correctness by formal proofs. It is more likely to fail due to a loose wire or electrical components.
Jan 18 2014
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 18 January 2014 at 01:46:55 UTC, Walter Bright wrote:
 On 1/17/2014 4:44 PM, "Ola Fosheim Grøstad" 
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 Big systems have to live with bugs, it is inevitable that they 
 run with bugs.
It's a dark and stormy night. You're in a 747 on final approach, flying on autopilot. Scenario 1 ---------- The autopilot software was designed by someone who thought it should keep operating even if it detects faults in the software. The software runs into a null object when there shouldn't be one, and starts feeding bad values to the controls. The plane flips over and crashes, everybody dies. But hey, the software kept on truckin'! Scenario 2 ---------- The autopilot software was designed by Boeing. Actually, there are two autopilots, each independently developed, with different CPUs, different hardware, different algorithms, different languages, etc. One has a null pointer fault. A deadman circuit sees this, and automatically shuts that autopilot down. The other autopilot immediately takes over. The pilot is informed that one of the autopilots failed, and the pilot immediately shuts off the remaining autopilot and lands manually. The passengers all get to go home. Note that in both scenarios there are bugs in the software. Yes there have been incidents with earlier autopilots where bugs in it caused the airplane to go inverted. Consider also the Toyota. My understanding from reading reports (admittedly journalists botch up the facts) is that a single computer controls the brakes, engine, throttle, ignition switch, etc. Oh joy. I wouldn't want to be in that car when it keeps on going despite having self-detected faults. It could, you know, start going at full throttle and ignore all signals to brake or turn off, only stopping when it crashes or runs out of gas.
You are running a huge website. Let's say, for instance, a social network with more than a billion users.

Scenario 1
----------
The software was designed by someone who thought it should keep operating even if it detects faults in the software. A bug arises in some frontend and it starts corrupting data. Some monitoring detects the issue, the code gets fixed, and the corrupted data are recovered from backup. Users that ended up on that cluster saw their accounts not working for a day, but everything is back to normal the day after.

Scenario 2
----------
The software was designed by an ex-employee of Boeing. He knows that he should make his software crash hard and soon. As soon as errors are detected on a cluster, the cluster goes down. Hopefully no data are corrupted, but the load on that cluster must now be handled by the other clusters. Soon enough these clusters overload and the whole website goes down. Hopefully no data were corrupted in the process, so nothing needs to be restored from backup.

Different software, different needs. Ultimately, that distinction is irrelevant anyway. The whole possibility of these scenarios can be avoided, in the case of null dereferences, by proper language design.
Jan 17 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:40 PM, deadalnix wrote:
 Different software, different needs. Ultimately, that distinction
 is irrelevant anyway. The whole possibility of these scenarios
 can be avoided in case of null dereferences by proper language
 design.
I've already agreed that detecting bugs at compile time is better.
Jan 17 2014
prev sibling parent "Paolo Invernizzi" <paolo.invernizzi gmail.com> writes:
On Saturday, 18 January 2014 at 00:44:56 UTC, Ola Fosheim Grøstad 
wrote:
 Big systems have to live with bugs, it is inevitable that they 
 run with bugs. Erlang was created for stuff like that. Can you 
 imagine a telephone central going down just because the 
 settings for one subscriber was wrong and triggered a bug?
Stupid me, I thought all this mess was solved decades ago with CPU protected mode. I tend to agree with Walter on this front. -- Paolo
Jan 18 2014
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 11:43:59AM -0800, Walter Bright wrote:
[...]
 I've almost never had a problem tracking down the cause of a null
 pointer. Usually just a few minutes with a debugger and getting a
 backtrace.
I think this depends on the kind of code you write. In callback-heavy code, usually when you're multiplexing between many simultaneous request/response chains, these kinds of problems are very hard to track down. You'll see the null pointer somewhere in your callback's context structure, but no amount of backtrace will help you go any further, because they all end at the event dispatch loop pretty shortly up the stack, which doesn't tell you where in the chain of events the null came from.

The callback could've been invoked from any number of places (usually callbacks are factored into generic functions so that you don't have to deal with writing 500 callbacks just to handle the response to each event type), so you have to deduce which of the n number of possibilities may have caused it. Usually, that only leads to finding that the previous event handler was only passing things along, so you have to go back yet another step in the chain. When there are n possible ancestors for each step in the chain, you're talking about n^k possible paths to investigate before you find the ultimate culprit. A few minutes is not going to suffice, even for moderate values of k.

If you're very very lucky, you may have a log of events that helps narrow k to a sufficiently small number that allows quick deduction of where the problem is. Often, though, all you have is a stack trace from an inaccessible customer production server, and you'll have to explore n^k possibilities before you can find the problem. (Even a stack trace should already be counted as lucky; I've had to fix *race conditions* with no stack trace and only indirect evidence that a daemon died and got restarted by init. It took months before we could even reproduce the problem -- it was highly dependent on precise network timing -- much less guess at what went wrong.)

Nipping the null at the bud (i.e., knowing where in the code it was first set as null) is a real life-saver in these kinds of situations.
T -- Кто везде - тот нигде.
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 1:10 PM, H. S. Teoh wrote:
 On Fri, Jan 17, 2014 at 11:43:59AM -0800, Walter Bright wrote:
 [...]
 I've almost never had a problem tracking down the cause of a null
 pointer. Usually just a few minutes with a debugger and getting a
 backtrace.
I think this depends on the kind of code you write. In callback-heavy code, usually when you're multiplexing between many simultaneous request/response chains, these kinds of problems are very hard to track down. You'll see the null pointer somewhere in your callback's context structure, but no amount of backtrace will help you go any further because they all end at the event dispatch loop pretty shortly up the stack, which doesn't tell you where in the chain of events the null came from.
What you do then is go back as far as practical, then put asserts in.
 When there are n possible ancestors for each step in the chain, you're
 talking about n^k possible paths to investigate before you find the
 ultimate culprit. A few minutes is not going to suffice, even for
 moderate values of k.
Put asserts in, maybe a printf too. I do it all the time to track down problems. Memory corruption can be a difficult one to track down, but I've never had trouble tracking down a null pointer's source. They aren't any harder than any "I have a bad value in this field, I wonder who set it" problem. I.e. it isn't special.
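The "go back and put asserts in" tactic looks like this in practice (illustrative Python; the pipeline and its names are made up): assert non-null at each hop so the failure fires where the bad value first appears, not several callbacks later.

```python
# Sketch: asserting at each boundary moves the failure point back
# toward the source of the null, which is the whole trick.
# (Hypothetical functions for illustration only.)

def load_config(path):
    cfg = None if path is None else {"path": path}
    # Fires here, at the producer, instead of deep in the consumer.
    assert cfg is not None, f"load_config produced None for {path!r}"
    return cfg

def start_service(path):
    cfg = load_config(path)
    assert cfg is not None, "config must be set before dispatch"
    return f"serving from {cfg['path']}"
```

With the asserts in place, the first bad hand-off trips immediately, and the backtrace points at the producer rather than at whichever distant consumer eventually dereferenced the null.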
 Nipping the null at the bud (i.e., know where in the code it was first
 set as null) is a real life-saver in these kinds of situations.
It's better for any bug to catch it at compile time.
Jan 17 2014
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 01:29:53PM -0800, Walter Bright wrote:
 On 1/17/2014 1:10 PM, H. S. Teoh wrote:
On Fri, Jan 17, 2014 at 11:43:59AM -0800, Walter Bright wrote:
[...]
I've almost never had a problem tracking down the cause of a null
pointer. Usually just a few minutes with a debugger and getting a
backtrace.
I think this depends on the kind of code you write. In callback-heavy code, usually when you're multiplexing between many simultaneous request/response chains, these kinds of problems are very hard to track down. You'll see the null pointer somewhere in your callback's context structure, but no amount of backtrace will help you go any further because they all end at the event dispatch loop pretty shortly up the stack, which doesn't tell you where in the chain of events the null came from.
What you do then is go back as far as practical, then put asserts in.
Of course, but that requires a few rounds of recompilation, repacking the EEPROM image, installation on device, and repeating a user interaction that possibly only randomly hits the bug. Hardly the work of a few minutes.
When there are n possible ancestors for each step in the chain,
you're talking about n^k possible paths to investigate before you
find the ultimate culprit. A few minutes is not going to suffice,
even for moderate values of k.
Put asserts in, maybe a printf too. I do it all the time to track down problems. Memory corruption can be a difficult one to track down, but I've never had trouble tracking down a null pointer's source. They aren't any harder than any "I have a bad value in this field, I wonder who set it" problem. I.e. it isn't special.
True. But in many cases, the source of a bad value is easy to find because of its uniqueness (hmm, why does variable x have a value of 24576? oh I know, search for 24576 in the code). Nulls, though, can occur in many places, and in D they are implicit. There's no easy way to search for that.
Nipping the null at the bud (i.e., know where in the code it was
first set as null) is a real life-saver in these kinds of situations.
It's better for any bug to catch it at compile time.
[...] Currently D doesn't allow catching nulls at compile-time. T -- Give me some fresh salted fish, please.
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 2:05 PM, H. S. Teoh wrote:
 Of course, but that requires a few rounds of recompilation, repacking
 the EEPROM image, installation on device, and repeating a user
 interaction that possibly only randomly hits the bug. Hardly the work of
 a few minutes.
I've done EEPROM programming & debugging, I know how to do it. Nulls still aren't a special kind of bug. BTW, even back in the 1970's, people developing EEPROM systems would plug a special device into the board which would redirect the EEPROM access to some off-board RAM, or would debug using emulators. Maybe this technology has been forgotten :-)
 But in many cases, the source of a bad value is easy to find because of
 its uniqueness (hmm, why does variable x have a value of 24576? oh I
 know, search for 24576 in the code). Nulls, though, can occur in many
 places, and in D they are implicit. There's no easy way to search for
 that.
Come on. Every type in D has a default initializer. There's still nothing special about null. Even if you got rid of all the nulls and instead use the null object pattern, you're not going to find it any easier to track it down, and you're in even worse shape because now it can fail and you may not even detect the failure, or may discover the error much, much further from the source of the bug.
Jan 17 2014
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 03:37:22PM -0800, Walter Bright wrote:
 On 1/17/2014 2:05 PM, H. S. Teoh wrote:
Of course, but that requires a few rounds of recompilation, repacking
the EEPROM image, installation on device, and repeating a user
interaction that possibly only randomly hits the bug. Hardly the work
of a few minutes.
I've done EEPROM programming & debugging, I know how to do it. Nulls still aren't a special kind of bug.
I wasn't arguing it is. I'm just saying it's not as trivial as you made it sound.
 BTW, even back in the 1970's, people developing EEPROM systems would
 plug a special device into the board which would redirect the EEPROM
 access to some off-board RAM, or would debug using emulators. Maybe
 this technology has been forgotten :-)
Well, nowadays people use VMware, which is essentially a glorified emulator. :) There are times when you have to test on the actual device, though, when emulation is imperfect.
But in many cases, the source of a bad value is easy to find because
of its uniqueness (hmm, why does variable x have a value of 24576? oh
I know, search for 24576 in the code). Nulls, though, can occur in
many places, and in D they are implicit. There's no easy way to
search for that.
Come on. Every type in D has a default initializer. There's still nothing special about null. Even if you got rid of all the nulls and instead use the null object pattern, you're not going to find it any easier to track it down, and you're in even worse shape because now it can fail and you may not even detect the failure, or may discover the error much, much further from the source of the bug.
If the null object remembers who created it (file + line), this would short-circuit a lot of the time-consuming detective work you'd have to do otherwise. That was the main thrust behind my original post.

I didn't really find the null object pattern, as it is generally used (dummy values for fields, no-ops for methods), particularly useful as a programming technique. It is, after all, just the OO version of a sentinel, and sentinels have been around since long before OO. But if the null object can be made smarter -- remember which piece of code created it, for example -- then it might begin to be useful. Otherwise, I don't see much value in Andrei's proposal. T -- We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true. -- Robert Wilensky
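A minimal sketch of such an origin-tracking null object, using made-up names (makeNull, frob) purely for illustration:

```d
import std.stdio;

// Sketch: the null object records the file and line where it was
// created, so the eventual failure can report the origin of the null
// rather than just the point of use.
class Widget
{
    string originFile;
    size_t originLine;
    bool isNullObject;

    // Factory for the null object; the default arguments capture the
    // *caller's* location, not this function's.
    static Widget makeNull(string file = __FILE__, size_t line = __LINE__)
    {
        auto w = new Widget;
        w.isNullObject = true;
        w.originFile = file;
        w.originLine = line;
        return w;
    }

    void frob()
    {
        import std.conv : to;
        if (isNullObject)
            throw new Exception("null Widget created at "
                ~ originFile ~ ":" ~ originLine.to!string);
    }
}

void main()
{
    auto w = Widget.makeNull();  // creation site recorded here
    try
    {
        w.frob();
    }
    catch (Exception e)
    {
        writeln(e.msg);          // names the creation site, not the use site
    }
}
```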
Jan 17 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 3:37 PM, Walter Bright wrote:
 Come on. Every type in D has a default initializer. There's still
 nothing special about null.
There is. Null pointers cause the process to die.
 Even if you got rid of all the nulls and instead use the null object
 pattern, you're not going to find it any easier to track it down, and
 you're in even worse shape because now it can fail and you may not even
 detect the failure, or may discover the error much, much further from
 the source of the bug.
How do you know all that? Andrei
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 4:17 PM, Andrei Alexandrescu wrote:
 On 1/17/14 3:37 PM, Walter Bright wrote:
 Come on. Every type in D has a default initializer. There's still
 nothing special about null.
There is. Null pointers cause the process to die.
This particular subthread is about finding the reason why there was a null, not about what happens when the invalid value is detected.
 Even if you got rid of all the nulls and instead use the null object
 pattern, you're not going to find it any easier to track it down, and
 you're in even worse shape because now it can fail and you may not even
 detect the failure, or may discover the error much, much further from
 the source of the bug.
How do you know all that?
Because I've tracked down the cause of many, many null pointers, and I've tracked down the cause of many, many other kinds of invalid values in a variable. Null pointers tend to get detected much sooner, hence closer to where they were set.
Jan 17 2014
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 18 January 2014 at 01:05:50 UTC, Walter Bright wrote:
 Because I've tracked down the cause of many, many null 
 pointers, and I've tracked down the cause of many, many other 
 kinds of invalid values in a variable. Null pointers tend to 
 get detected much sooner, hence closer to where they were set.
To be fair, I do think we should, for instance, require explicit initialization for char instead of putting an invalid codepoint in it. Still, the consequences for null are much more dramatic, which makes null more worthwhile to change. The two don't have the same ROI because they don't have the same cost.
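For reference, D already applies this strategy to chars and floats: their default initializers are deliberately invalid values, so a forgotten initialization is more likely to surface. A small sketch:

```d
import std.math : isNaN;

void main()
{
    char c;                  // default-initialized
    assert(c == char.init);
    assert(c == 0xFF);       // deliberately invalid as a UTF-8 code unit

    double d;                // floats likewise default to NaN...
    assert(isNaN(d));        // ...which "poisons" any computation using it
}
```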
Jan 17 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 01:05:50 UTC, Walter Bright wrote:
 Because I've tracked down the cause of many, many null 
 pointers, and I've tracked down the cause of many, many other 
 kinds of invalid values in a variable. Null pointers tend to 
 get detected much sooner, hence closer to where they were set.
Some null pointers. The ones that were set in the current call-chain, but not the long-lived ones.

Anyway, pointers are special since they link together subsystems. Yes, it is better to have a null value than a pointer to a random location, so "null" isn't the worst value for a pointer. It is the best arbitrary wrong value, and there is nothing wrong with saying "Hey, when we are wrong about User identity, we want to keep the system running assuming a Guest identity".

However, C-like null values conflate a wide array of semantics; I think you can have nearly half a dozen types of null values in a database. In C-like languages null can mean: not-yet-initialized (used too early), failed-computation (bottom), entity-shall-not-have-subentity (deliberate lack of), empty-container (list is empty), service-not-ready (try again), etc.

So yeah, it would be nice if the language were able to distinguish between the different semantics of "null" for different pointer types and convert them into something sensible when "cast" to a different type. For some pointer types it might make sense to kill the thread, for others to throw an exception, for others to allocate an empty object, for others to use a dummy/sentinel.

It isn't universally true that the best option is to core dump, but you could do both. You could fork(), core-dump the fork, and recover the original.
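One way to sketch that distinction in D today is a library type that records *why* a value is absent, instead of collapsing every case into a single null. This is an illustrative sketch (Maybe and Absence are made-up names, not an existing library):

```d
// Why a value is absent -- roughly the cases listed above.
enum Absence
{
    notYetInitialized,
    failedComputation,
    deliberatelyNone,
    empty,
    serviceNotReady,
}

struct Maybe(T)
{
    private T payload;
    private bool present;
    Absence reason = Absence.notYetInitialized;

    static Maybe some(T value)
    {
        Maybe m;
        m.payload = value;
        m.present = true;
        return m;
    }

    static Maybe none(Absence why)
    {
        Maybe m;
        m.reason = why;
        return m;
    }

    // Truthiness: `if (m)` tests presence.
    bool opCast(B : bool)() const { return present; }

    // Each kind of absence could trigger a different policy here:
    // throw, retry, substitute a default, etc.
    T get()
    {
        import std.conv : to;
        if (!present)
            throw new Exception("absent value: " ~ reason.to!string);
        return payload;
    }
}

void main()
{
    auto user = Maybe!string.none(Absence.serviceNotReady);
    if (!user)
        assert(user.reason == Absence.serviceNotReady); // caller reacts per reason
}
```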
Jan 17 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 5:05 PM, Walter Bright wrote:
 On 1/17/2014 4:17 PM, Andrei Alexandrescu wrote:
 Even if you got rid of all the nulls and instead use the null object
 pattern, you're not going to find it any easier to track it down, and
 you're in even worse shape because now it can fail and you may not even
 detect the failure, or may discover the error much, much further from
 the source of the bug.
How do you know all that?
Because I've tracked down the cause of many, many null pointers, and I've tracked down the cause of many, many other kinds of invalid values in a variable. Null pointers tend to get detected much sooner, hence closer to where they were set.
I'm not sure at all. Don't forget you have worked on one category of programs for the last 15 years. Reactive/callback-based code can be quite different, as a poster noted. I'll check with the folks to confirm that. The larger point I'm trying to make is that both your position and mine require that we un-bias ourselves as much as we can. Andrei
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:48 PM, Andrei Alexandrescu wrote:
 On 1/17/14 5:05 PM, Walter Bright wrote:
 On 1/17/2014 4:17 PM, Andrei Alexandrescu wrote:
 Even if you got rid of all the nulls and instead use the null object
 pattern, you're not going to find it any easier to track it down, and
 you're in even worse shape because now it can fail and you may not even
 detect the failure, or may discover the error much, much further from
 the source of the bug.
How do you know all that?
Because I've tracked down the cause of many, many null pointers, and I've tracked down the cause of many, many other kinds of invalid values in a variable. Null pointers tend to get detected much sooner, hence closer to where they were set.
I'm not sure at all. Don't forget you have worked on a category of programs for the last 15 years. Reactive/callback-based code can be quite different, as a poster noted. I'll check with the folks to confirm that.
We've talked about the floating point issue, where nan values "poison" the results rather than raise a seg fault. I remembered later that a while back Don tried to change D to throw an exception when a floating point nan was encountered, specifically because this made it easier for him to track down the source of the nan. He said it was pretty hard to backtrack it from the eventual output. In any case, I'm open to data that supports the notion that delayed detection of an error makes the source of the error easier to identify. That's not been my experience, and my intuition also finds that improbable.
 The larger point I'm trying to make is that both your position and mine
 require that we un-bias ourselves as much as we can.
I agree with that. So I await data :-)
Jan 17 2014
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Jan 17, 2014 at 07:11:16PM -0800, Walter Bright wrote:
[...]
 We've talked about the floating point issue, where nan values
 "poison" the results rather than raise a seg fault. I remembered
 later that a while back Don tried to change D to throw an exception
 when a floating point nan was encountered, specifically because this
 made it easier for him to track down the source of the nan. He said
 it was pretty hard to backtrack it from the eventual output.
[...] This is tangential to the discussion, but the nan issue got me thinking about how one might detect the source of a nan. One crude way might be to use a pass-thru function that asserts if its argument is a nan:

    real assumeNotNan(real v, string file = __FILE__, size_t line = __LINE__)
    {
        assert(!isNaN(v), "assumeNotNan failed at " ~ file ~
            " line " ~ to!string(line));
        return v;
    }

Then you can stick this inside a floating-point expression in the same way you'd stick assert(ptr !is null) in some problematic code in order to find where the null is coming from, to narrow down the source of the nan:

    real complicatedComputation(real x, real y, real z)
    {
        // Original code:
        //return sqrt(x) * y - z/(x-y) + suspiciousFunc(x,y,z);

        // Instrumented code (UFCS rawkz!):
        return sqrt(x).assumeNotNan * y
            - (z/(x-y)).assumeNotNan
            + suspiciousFunc(x,y,z).assumeNotNan;
    }

Makes it *slightly* easier than having to break up a complex expression and introduce temporaries in order to insert asserts between terms. But, granted, not by much. Still, it reduces the pain somewhat. T -- Political correctness: socially-sanctioned hypocrisy.
Jan 17 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/18/2014 12:37 AM, Walter Bright wrote:
 Come on. Every type in D has a default initializer.
This is not true any longer. There is @disable this().
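A minimal sketch of what @disable this() does to default initialization, shown here on a hypothetical wrapper struct:

```d
struct NonNullRef
{
    Object payload;

    @disable this();             // no default initializer: a bare
                                 // declaration becomes a compile error

    this(Object o)
    {
        assert(o !is null);
        payload = o;
    }
}

void main()
{
    // NonNullRef r;             // error: default construction is disabled
    auto r = NonNullRef(new Object);  // must be explicitly initialized
    assert(r.payload !is null);
}
```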
Jan 18 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 1:29 PM, Walter Bright wrote:
 On 1/17/2014 1:10 PM, H. S. Teoh wrote:
 On Fri, Jan 17, 2014 at 11:43:59AM -0800, Walter Bright wrote:
 [...]
 I've almost never had a problem tracking down the cause of a null
 pointer. Usually just a few minutes with a debugger and getting a
 backtrace.
I think this depends on the kind of code you write. In callback-heavy code, usually when you're multiplexing between many simultaneous request/response chains, these kinds of problems are very hard to track down. You'll see the null pointer somewhere in your callback's context structure, but no amount of backtrace will help you go any further because they all end at the event dispatch loop pretty shortly up the stack, which doesn't tell you where in the chain of events the null came from.
What you do then is go back as far as practical, then put asserts in.
That hardly helps. The source of the problem is not detecting that it occurs, it's avoiding it in the first place by initializing data appropriately. Andrei
Jan 17 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 11:43 AM, Walter Bright wrote:
 On 1/17/2014 6:06 AM, "Ola Fosheim Grøstad"
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Friday, 17 January 2014 at 01:42:38 UTC, Andrei Alexandrescu wrote:
 One common idiom to replace null pointer exceptions with milder
 reproducible
 errors is the null object pattern, i.e. there is
Usually null failures are hard to track when some function returned a null value as an indication of error when the programmer expected an exception.
I've almost never had a problem tracking down the cause of a null pointer. Usually just a few minutes with a debugger and getting a backtrace.
I think this is bias that is countered by extensive experience of groups at Facebook that have no axe to grind in the matter. This bias makes it all the more difficult to recognize the importance of the problem. Andrei
Jan 17 2014
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-17 01:42:37 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Further avenues are opened by thinking what happens if e.g. init is 
 private or @disable-d.
 
 Thoughts?
Some more thoughts. Couldn't we just add a class declaration attribute that'd say that by default this type is not-null? For instance:

    // references to A are not-null by default
    notnull class A {}

    // references to B are not-null too (inherited)
    class B : A {}

Then you have two types:

    A    // not-null reference to A.
    A?   // nullable reference to A.

Use it like this:

    A a;   // error: cannot be null, default initialization not allowed
    A? a;  // default-initialized to null

    void test1(A a)  // param a cannot be null
    {
        a.foo();  // fine
    }

    void test2(A? a)  // param a can be null
    {
        a.foo();         // error: param a could be null
        if (a) a.foo();  // param a cannot be null, allowed
    }

    void test3(A? a)  // param a can be null
    {
        A aa = a;        // error: param a could be null, can't init to not-null type
        if (a) aa = a;   // param a cannot be null, allowed
    }

This is basically what everyone would wish for a not-null type to do. The syntax is clean and the control flow forces you to check for null before use. Misuses result in a compile-time error. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
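Until such compiler support exists, the proposal can be loosely approximated in library code, with runtime asserts standing in for the proposed compile-time errors (a sketch; the names are illustrative):

```d
// NotNull!A plays the role of the proposed "A" (never null);
// a plain A reference plays the role of "A?" (nullable).
struct NotNull(T) if (is(T == class))
{
    private T payload;

    @disable this();                 // no null default initialization

    this(T value)
    {
        assert(value !is null, "NotNull constructed from null");
        payload = value;
    }

    alias payload this;              // use like an ordinary reference
}

class A { void foo() {} }

void test1(NotNull!A a)              // parameter cannot be null
{
    a.foo();                         // fine, no check needed
}

void test2(A a)                      // parameter may be null ("A?")
{
    if (a !is null)
        test1(NotNull!A(a));         // checked conversion to not-null
}

void main()
{
    test2(new A);
    test2(null);                     // safely skipped
}
```

The checked conversion in test2 mirrors the `if (a) a.foo();` flow analysis in the proposal, except that here the null check is explicit and enforced at runtime rather than by the compiler.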
Jan 17 2014
next sibling parent reply "Yota" <yotaxp thatGoogleMailThing.com> writes:
On Friday, 17 January 2014 at 15:05:10 UTC, Michel Fortin wrote:
 On 2014-01-17 01:42:37 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:

 Further avenues are opened by thinking what happens if e.g. 
 init is private or @disable-d.
 
 Thoughts?
Some more thoughts. Couldn't we just add a class declaration attribute that'd say that by default this type is not-null?
I have been hoping D would become not-null-by-default since I discovered the language. Looking at what it has done with other type qualifiers (shared, immutable, etc.), it always seemed like something that would be right up its alley. However, in my dream world it would simply apply to all reference types by default. I'm not a fan of using an attribute for this due to how hidden it is from the programmer. You should be able to determine the nullability of something from looking at the code that uses it. But then again, I feel like there should also be a way to distinguish value types from reference types just by looking at the usage, so what do I know? =p
Jan 17 2014
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-17 17:52:18 +0000, "Yota" <yotaxp thatGoogleMailThing.com> said:

 On Friday, 17 January 2014 at 15:05:10 UTC, Michel Fortin wrote:
 Couldn't we just add a class declaration attribute that'd say that by 
 default this type is not-null?
I have been hoping D would become not-null-by-default since I discovered the language. Looking at what it has done with other type classes, (shared, immutable, etc) it always seemed like something that would be right up its alley. However in my dream world, it would simply apply to all reference types by default.
Well, nothing prevents that from happening if notnull is adopted. As proposed, notnull is designed to keep the language backward compatible: it's opt-in and you don't even have to opt-in all at once. Potential problems the early implementation might have can only burn those who give it a try. One can hope the feature will become so loved that it'll become standard pointer behaviour one day, but that can't happen as long as the feature is theoretical. In the eventuality we decide one day to push not-null behaviour to the rest of the language, notnull will just become an attribute with no effect. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Jan 18 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-01-17 16:05, Michel Fortin wrote:

 This is basically what everyone would wish for a not-null type to do.
 The syntax is clean and the control flow forces you to check for null
 before use. Misuses result in a compile-time error.
Yes, exactly. But the issue is that the compiler needs to be modified for this. -- /Jacob Carlborg
Jan 17 2014
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-17 20:21:36 +0000, Jacob Carlborg <doob me.com> said:

 On 2014-01-17 16:05, Michel Fortin wrote:
 
 This is basically what everyone would wish for a not-null type to do.
 The syntax is clean and the control flow forces you to check for null
 before use. Misuses result in a compile-time error.
Yes, exactly. But the issue is that the compiler needs to be modified for this.
Andrei's post was referring to language/compiler changes too: allowing init to be defined per-class, with a hint about disabling init. I took the hint that modifying the compiler to add support for non-null was in the cards and proposed something more useful and less clunky to use.
Jan 17 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 12:42 PM, Michel Fortin wrote:
 On 2014-01-17 20:21:36 +0000, Jacob Carlborg <doob me.com> said:

 On 2014-01-17 16:05, Michel Fortin wrote:

 This is basically what everyone would wish for a not-null type to do.
 The syntax is clean and the control flow forces you to check for null
 before use. Misuses result in a compile-time error.
Yes, exactly. But the issue is that the compiler needs to be modified for this.
Andrei's post was referring to language/compiler changes too: allowing init to be defined per-class, with a hint about disabling init. I took the hint that modifying the compiler to add support for non-null was in the cards and proposed something more useful and less clunky to use.
Yes, improving the language is in the cards. I have collected enough hard evidence to convince myself that null references are a real and important issue (previously I'd agreed with Walter who is considering it not particularly remarkable). Andrei
Jan 17 2014
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Saturday, 18 January 2014 at 00:12:16 UTC, Andrei Alexandrescu 
wrote:
 Yes, improving the language is in the cards. I have collected 
 enough hard evidence to convince myself that null references 
 are a real and important issue (previously I'd agreed with 
 Walter who is considering it not particularly remarkable).
Have they tried using a NotNull!T yet? I wrote one ages ago for a phobos pull request that didn't really go anywhere, but I'm still keeping my file updated with new ideas (most recently, the if(auto nn = x.checkNull) {} thing): http://arsdnet.net/dcode/notnull.d I really think that is very close to meeting all the requirements here, but tbh I don't use it much in the real world myself so I could be missing a major flaw. Of course, such library solutions don't change the default nullability of references.
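The checkNull idiom mentioned above can be sketched roughly like this (an assumed implementation for illustration, not the actual code in notnull.d):

```d
struct Checked(T)
{
    private T payload;

    // Truthiness in `if (auto nn = ...)` comes from this cast.
    bool opCast(B : bool)() const { return payload !is null; }

    // Inside the if-branch the payload is known to be non-null.
    T get() { assert(payload !is null); return payload; }
}

Checked!T checkNull(T)(T obj) if (is(T == class))
{
    return Checked!T(obj);
}

class Widget { int x = 42; }

void main()
{
    Widget w = null;
    if (auto nn = w.checkNull)       // branch not taken: w is null
        assert(false);

    w = new Widget;
    if (auto nn = w.checkNull)       // branch taken: nn wraps non-null w
        assert(nn.get.x == 42);
}
```

The point of the idiom is that the only way to reach the wrapped reference is through the scope in which the null test already succeeded.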
Jan 17 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 4:21 PM, Adam D. Ruppe wrote:
 On Saturday, 18 January 2014 at 00:12:16 UTC, Andrei Alexandrescu wrote:
 Yes, improving the language is in the cards. I have collected enough
 hard evidence to convince myself that null references are a real and
 important issue (previously I'd agreed with Walter who is considering
 it not particularly remarkable).
Have they tried using a NotNull!T yet?
Not a D project (Java/Android). However there is work underway on a static analyzer for Java. I cannot give exact stats, but we have gathered extensive data that null pointer exceptions form a major, major problem on Facebook's Android app, and that virtually all can be solved by static analysis. No amount of speculation and hypothesizing will talk me out of that. Andrei
Jan 17 2014
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 18 January 2014 at 00:25:49 UTC, Andrei Alexandrescu
wrote:
 On 1/17/14 4:21 PM, Adam D. Ruppe wrote:
 On Saturday, 18 January 2014 at 00:12:16 UTC, Andrei 
 Alexandrescu wrote:
 Yes, improving the language is in the cards. I have collected 
 enough
 hard evidence to convince myself that null references are a 
 real and
 important issue (previously I'd agreed with Walter who is 
 considering
 it not particularly remarkable).
Have they tried using a NotNull!T yet?
Not a D project (Java/Android). However there is work underway on a static analyzer for Java. I cannot give exact stats, but we have gathered extensive data that null pointer exceptions form a major, major problem on Facebook's Android app, and that virtually all can be solved by static analysis. No amount of speculation and hypothesizing will talk me out of that. Andrei
I remember mentioning that a year or two ago, and you dismissing it. Happy to see you changed your mind on that.
Jan 17 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 4:25 PM, Andrei Alexandrescu wrote:
 On 1/17/14 4:21 PM, Adam D. Ruppe wrote:
 On Saturday, 18 January 2014 at 00:12:16 UTC, Andrei Alexandrescu wrote:
 Yes, improving the language is in the cards. I have collected enough
 hard evidence to convince myself that null references are a real and
 important issue (previously I'd agreed with Walter who is considering
 it not particularly remarkable).
Have they tried using a NotNull!T yet?
Not a D project (Java/Android). However there is work underway on a static analyzer for Java. I cannot give exact stats, but we have gathered extensive data that null pointer exceptions form a major, major problem on Facebook's Android app, and that virtually all can be solved by static analysis. No amount of speculation and hypothesizing will talk me out of that.
I don't disagree that some form of static analysis that can detect null dereferencing at compile time is a good thing. I concur that detecting bugs at compile time is better.

I don't disagree that for an app like a game, soldiering on after the program is in an invalid state doesn't do any particular harm, and so one can make a good case for using the null object pattern.

I do not agree with the notion that null pointers detected at runtime are harder to track down than other kinds of invalid values detected at runtime.

I strongly, strongly disagree with the notion that critical systems should soldier on once they have entered an invalid state. Such is absolutely the wrong way to go about making a fault-tolerant system. For hard evidence, I submit the safety record of airliners.
Jan 17 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 I strongly, strongly disagree with the notion that critical 
 systems should soldier on once they have entered an invalid 
 state.
The idea is to design the language and its type system (and static analysis tools) to be able to reduce the frequency (or probability) of such invalid states, because many of them are removed while you write the program. Bye, bearophile
Jan 17 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:12 PM, bearophile wrote:
 Walter Bright:

 I strongly, strongly disagree with the notion that critical systems should
 soldier on once they have entered an invalid state.
The idea is to design the language and its type system (and static analysis tools) to be able to reduce the frequency (or probability) of such invalid states, because many of them are removed while you write the program.
Once again, "I don't disagree that some form of static analysis that can detect null dereferencing at compile time (i.e. static analysis) is a good thing. I concur that detecting bugs at compile time is better."
Jan 17 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 01:58:13 UTC, Walter Bright wrote:
 I strong, strongly, disagree with the notion that critical 
 systems should soldier on once they have entered an invalid 
 state. Such is absolutely the wrong way to go about making a 
 fault tolerant system. For hard evidence, I submit the safety 
 record of airliners.
But then you have to define "invalid state", not in terms of language constructs, but in terms of the model the implementation is based on. If your thread only uses thread-local memory and safe language features, it should be sufficient to spin down that thread and restart that sub-service. That's what fault-tolerant operating systems do.

In a functional language it is easy; you may keep computing with "bottom" until it disappears: "bottom" OR true => true, "bottom" AND false => false. You might even succeed by having lazy evaluation over functions that would never halt.

If you KNOW that a particular type is not going to have any adverse effect if it takes a default object rather than null (you could probably even prove it in some cases), it does not produce an "invalid state". It might be an exceptional state, but that does not imply that it is invalid.

Some systems are in a pragmatic fuzzy state. Take fuzzy logic as an example (http://www.dmitry-kazakov.de/ada/fuzzy.htm), where you operate with two fluid dimensions, necessity and possibility, representing a space with the extremes false, true, contradiction and uncertain. There is no "invalid state" if the system is designed to be in a state of "best effort".
Jan 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:42 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 But then you have to define "invalid state",
An unexpected value is an invalid state.
Jan 17 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 18 January 2014 at 02:59:43 UTC, Walter Bright wrote:
 On 1/17/2014 6:42 PM, "Ola Fosheim Grøstad" 
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 But then you have to define "invalid state",
An unexpected value is an invalid state.
It is only an invalid state for a subsystem; if your code is written to handle it, it can contain it and recover (or disable that subsystem), assuming that you know it is unlikely to be caused by memory corruption. The problem with being rigid on this definition is that most non-trivial programs are constantly in an invalid state and therefore should not even be allowed to start. Basically you should stop making DMD available: it contains bugs, so it is constantly in an invalid state vs the published model. State is not only variables. State is code too (e.g. a state machine). What is the essential difference between insisting on stopping a program with bugs and insisting on not starting a program with bugs? There is no difference. Still, most companies ship software with known non-fatal bugs.
Jan 18 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/19/2014 12:01 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Saturday, 18 January 2014 at 02:59:43 UTC, Walter Bright wrote:
 On 1/17/2014 6:42 PM, "Ola Fosheim Grøstad"
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 But then you have to define "invalid state",
An unexpected value is an invalid state.
It is only an invalid state for a subsystem; if your code is written to handle it, it can contain it and recover (or disable that subsystem), assuming that you know it is unlikely to be caused by memory corruption. ...
This is not a plausible assumption. What you tend to know is that the program is unlikely to fail because otherwise it would not have been shipped, being safety critical. I.e. when it fails, you don't know that it is unlikely to be caused by something. It could be hardware failure, and even a formal correctness proof does not help with that.
 The problem with being rigid on this definition is ...
He is not.
 What is the essential difference between insisting on stopping a program
 with bugs and insisting on not starting a program with bugs? There is no
 difference.
 ...
Irrelevant. He is arguing for stopping the system once it has _become clear_ that the _current execution_ might not deliver the expected results.
Jan 19 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Sunday, 19 January 2014 at 08:41:23 UTC, Timon Gehr wrote:
 This is not a plausible assumption. What you tend to know is 
 that the program is unlikely to fail because otherwise it would 
 not have been shipped, being safety critical.
I feel that this "safety critical" argument is ONE BIG RED HERRING. It is completely irrelevant to the topic, which is whether it is useful to have recovery mechanisms for null pointers or not. It is, but not if you are forced to use them. Nobody suggests that you should be forced to recover from a null dereference.

Nevertheless: if your application is safety critical and is neither proven correct nor has undergone exhaustive testing (all combinations of input), then it most likely is a complex system which is likely to contain bugs. You can deal with this by partitioning the system into independent subsystems (think functional programming) which you control in a domain-specific manner (e.g. you can have multiple algorithms and select the median value or the most conservative estimate, spin down subsystems and revert to a less optimal, more resource-demanding state, run a verifier on the result, etc.).
 I.e. when it fails, you don't know that it is unlikely to be 
 caused by something. It could be hardware failure, and even a 
 formal correctness proof does not help with that.
But hardware failure is not a valid issue when discussing programming language constructs? Of course the system design should account for hardware failure, extreme weather that makes sensors go out of range and a drunken sailor pressing all buttons at once. Not a programming language construct topic.
 Irrelevant. He is arguing for stopping the system once it has 
 _become clear_ that the _current execution_ might not deliver 
 the expected results.
Then you have to DEFINE what you mean by expected results. Which is domain specific, not a programming language construct issue in a general programming language.

If the expected result is defined as "having a result is better than no result", then stopping the system is the worst thing you could do. If the expected result controls N different systems, then it might be better to fry 1 system and keep N-1 systems running than to fry N systems. That's a domain-specific choice the system designer should have the right to make. Sacrifice one leg to save the other limbs.

Think about the effect of this: 1 router detects a bug; by the logic in this thread it should then notify all routers running the same software and tell them to shut down immediately. Result: insta-death to the entire Internet.
Jan 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/20/2014 6:18 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 Think about the effect of this: 1 router detects a bug, by the logic in this
 thread it should then notify all routers running the same software and tell
them
 to shut down immediately. Result: insta-death to entire Internet.
Not only has nobody suggested this, I have explicitly written here otherwise, more than once.

I infer you think that my arguments here are based on something I dreamed up in 5 minutes of tip-tapping at my keyboard. They are not. They are what Boeing and the aviation industry use extremely successfully to create incredibly safe airliners, and the track record is there for anyone to see.

It's fine if you believe you've found a better way. But there's a high bar of existing practice and experience to overcome with a better way, and a need to start from a position of understanding that successful existing practice first.
Jan 20 2014
parent reply "Ola Fosheim Grøstad" writes:
On Monday, 20 January 2014 at 19:27:31 UTC, Walter Bright wrote:
 I infer you think that my arguments here are based on something 
 I dreamed up in 5 minutes of tip-tapping at my keyboard. They 
 are not. They are what Boeing and the aviation industry use 
 extremely successfully to create incredibly safe airliners, and 
 the track record is there for anyone to see.
No, but I think you are conflating narrow domains with a given practice and broader application development needs, and I wonder what the relevance is to this discussion of having other options than bailing out. I assume that you were making a point that is of relevance to application programmers? I assume that there is more to this than an anecdote?

And… no, it is not OK to say that one should accept a buggy program being in an inconsistent state until the symptoms surface and only then do a headshot, which is the reasoning behind one-buggy-implementation-running-until-null-crash. "You aren't ill until you look ill"?

But that is not what Boeing does, is it? Because they use a comparator: "we may be ill, but we are unlikely to show the same symptoms, so if we agree we assume that we are well." Which is a much more acceptable "excuse" (in terms of probability of damage). Why? Because a 0.001% chance of implementation-related failure could be reduced to, say, 0.000000001% (depending on the resolution of the output etc.).

Ideally safe-D should conceptually give you isolates, so that an application can call a third-party library that loads a corrupted file and crashes on a null pointer (because that code path has never been run before), and you catch that crash and continue. Yes, the library is buggy and only handles consistent files well, but as an application programmer that is fine. I only want to successfully load non-corrupt files; there is no need to fix that library, iff the library/language assures that it behaves as if run as an isolate (no side effects on the rest of the program). Wasting resources on handling corrupt files gracefully is pointless if you can isolate and contain the problem.

It is fine if HALT is the default in D; defaults should be conservative. It is not fine if the attitude is that HALT is the best option even when the programmer thinks otherwise and anticipates trouble.
Jan 20 2014
parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Monday, 20 January 2014 at 20:01:58 UTC, Ola Fosheim Grøstad 
wrote:
 Ideally safe-D should conceptually give you isolates so that an 
 application can call a third party library that loads a 
 corrupted file and crash on a null-ptr (because that code path 
 has never been run before) and you catch that crash and 
 continue. Yes, the library is buggy and only handles consistent 
 files well, but as an application programmer that is fine.
The point is: for true isolation you'll need another process. If you are aware that it could die: let it be. Just restart it or throw the file away or whatever. So given true isolation hlt on null ptr dereference isn't an issue.
Jan 20 2014
parent "Ola Fosheim Grøstad" writes:
On Monday, 20 January 2014 at 20:11:33 UTC, Tobias Pankrath wrote:
 The point is: for true isolation you'll need another process. 
 If you are aware that it could die: let it be. Just restart it 
 or throw the file away or whatever.
That is not an option. I started looking at D in early 2006 because I was looking for a language to create an experimental virtual world server. C++ is out of the question because of all its deficiencies (except for some simulation parts that have to be bug-free), and D could have been a good fit. Forking is too slow and tedious.

File loading was just an example. The "isolate" should have read access to global state (measured in gigabytes), but not write access. If you cannot have "probable" isolates in safe D, then it is not suitable for "application level" server designs that undergo evolutionary development. I am not expecting true isolates, but "probable" ones, meaning: the system is more likely to go down for some other reason than a leaky isolate.

Isolates and futures are also very simple and powerful abstractions for getting multi-threading in web services in a trouble-free manner.
 So given true isolation hlt on null ptr dereference isn't an 
 issue.
You don't need hardware isolation to do this in a way that works in practice. It should be sufficient to do static analysis and get a list of trouble areas which you can inspect.
Jan 20 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/17/14 5:58 PM, Walter Bright wrote:
 I strong, strongly, disagree with the notion that critical systems
 should soldier on once they have entered an invalid state. Such is
 absolutely the wrong way to go about making a fault tolerant system. For
 hard evidence, I submit the safety record of airliners.
You're arguing against a strawman, and it is honestly not pleasant that you reduced one argument to the other. Andrei
Jan 17 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 6:56 PM, Andrei Alexandrescu wrote:
 On 1/17/14 5:58 PM, Walter Bright wrote:
 I strong, strongly, disagree with the notion that critical systems
 should soldier on once they have entered an invalid state. Such is
 absolutely the wrong way to go about making a fault tolerant system. For
 hard evidence, I submit the safety record of airliners.
You're arguing against a strawman, and it is honestly not pleasant that you reduced one argument to the other.
I believe I have correctly separated the various null issues into 4 separate ones, and have tried very hard not to conflate them.
Jan 17 2014
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2014-01-17 21:42, Michel Fortin wrote:

 Andrei's post was referring at language/compiler changes too: allowing
 init to be defined per-class, with a hint about disabling init. I took
 the hint that modifying the compiler to add support for non-null was in
 the cards and proposed something more useful and less clunky to use.
Hehe, right, I kind of lost track of the discussion. -- /Jacob Carlborg
Jan 18 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions. Aside from that, non-null is only one case of a universe of types that are subsets of other types. I'd prefer a more general solution.
Jan 18 2014
next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Saturday, 18 January 2014 at 21:05:02 UTC, Walter Bright wrote:
 On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions. Aside from that, non-null is only one case of a universe of types that are subsets of other types. I'd prefer a more general solution.
What about a new storage class/modifier like @safe_ref / @disable_null / @disallow_null?

----
class A { }

void test(@safe_ref A) { }

A a;
test(a); // error
----

and maybe also:

----
@safe_ref A a; // error
----
Jan 18 2014
parent reply "Namespace" <rswhite4 googlemail.com> writes:
bearophile proposes the following syntax:
 T? means T nullable
 T  = means not nullable.
Here: http://d.puremagic.com/issues/show_bug.cgi?id=4571

And here: http://forum.dlang.org/thread/mailman.372.1364547485.4724.digitalmars-d puremagic.com?page=6#post-jlqdndoqeuhixxcmcbar:40forum.dlang.org
Jan 18 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 bearophile proposes the following syntax:
In this case the semantics matters much more than the syntax :-) Bye, bearophile
Jan 18 2014
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 21:05:01 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions.
I thought that because the left hand side was a type it could be disambiguated, somewhat like '*' is disambiguated between pointer declaration and multiplication depending on whether the left hand side is a type:

    int * a; // pointer declaration
    a * b;   // multiplication

But I might be wrong, especially since ?: is a ternary operator while * is always binary.
 Aside from that, non-null is only one case of a universe of types that 
 are subsets of other types. I'd prefer a more general solution.
Hum, what kind of thing do we want to support in a general solution? The closest thing I can think of is range constraints. Here's an example (invented syntax):

    void test(int!0..10 a) // param a is in 0..10 range
    {
        int b = a;        // fine
        int!0..5 c = a;   // error, incompatible ranges
        int!0..5 d = a/2; // fine, can't escape range
        if (a < 5)
            d = a;        // fine, 'a' is int!0..5 on this branch
    }

Obviously, compiler code for range constraints and non-null pointers would be shared to some extent, as the rules are pretty similar. But what other cases should be covered by a more general solution? Do we want to add a way for users to declare their own custom type modifiers that can do things that depend on control flow?

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Jan 18 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 1:38 PM, Michel Fortin wrote:
 The closest thing I can think of is range constrains. Here's an example
 (invented syntax):
I don't think a new syntax is required. We already have the template syntax: RangedInt!(0,10) should do it.
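A minimal sketch of the kind of library type meant here (RangedInt is not an actual Phobos type; the name and details are illustrative). The bounds are template parameters and the check runs on construction and assignment:

```d
// Hypothetical ranged integer: validates on construction and
// assignment, and converts back to int via alias this.
struct RangedInt(int min, int max)
{
    private int value = min;

    this(int v) { opAssign(v); }

    void opAssign(int v)
    {
        assert(v >= min && v <= max, "value out of range");
        value = v;
    }

    alias value this; // implicit conversion to the supertype int
}

void takesDigit(RangedInt!(0, 10) a) { /* a is 0..10 by construction */ }
```

As the follow-ups below discuss, the check in this sketch is a runtime one; catching violations at compile time requires CTFE or flow analysis.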
Jan 18 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 I don't think a new syntax is required. We already have the 
 template syntax:

    RangedInt!(0,10)

 should do it.
Is this array literal accepted, and can D spot the out-of-range bug at compile time (the Ada language allows both things)?

    RangedInt!(0, 10)[] arr = [1, 5, 12, 3, 2];

Probably there are other semantic details that should be handled.

Bye,
bearophile
Jan 18 2014
parent "Nicolas Sicard" <dransic gmail.com> writes:
On Saturday, 18 January 2014 at 22:12:09 UTC, bearophile wrote:
 Walter Bright:

 I don't think a new syntax is required. We already have the 
 template syntax:

   RangedInt!(0,10)

 should do it.
Is this array literal accepted, and can D spot the out-of-range bug at compile time (the Ada language allows both things)?

    RangedInt!(0, 10)[] arr = [1, 5, 12, 3, 2];
Even though the syntax would be less lean, D can already do this with templates and/or CTFE quite easily.
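A sketch of the CTFE route mentioned here (the `checked` helper is invented for illustration): binding the result to an enum forces the validation to run at compile time, so an out-of-range literal becomes a compile error.

```d
// Validates an array literal during CTFE; an assert failure during
// compile-time evaluation is reported as a compilation error.
int[] checked(int min, int max)(int[] values)
{
    foreach (v; values)
        assert(v >= min && v <= max, "literal element out of range");
    return values;
}

enum ok = checked!(0, 10)([1, 5, 3, 2]);   // fine
// enum bad = checked!(0, 10)([1, 5, 12]); // rejected during CTFE
```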
Jan 19 2014
prev sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 21:57:21 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 1/18/2014 1:38 PM, Michel Fortin wrote:
 The closest thing I can think of is range constrains. Here's an example
 (invented syntax):
I don't think a new syntax is required. We already have the template syntax: RangedInt!(0,10) should do it.
It works, up to a point.

    void foo(RangedInt!(0, 5) a);

    void bar(RangedInt!(0, 10) a)
    {
        if (a < 5)
            foo(a); // what happens here?
    }

In that "foo(a)" line, depending on the implementation of RangedInt, you either get a compile-time error that RangedInt!(0, 10) can't be implicitly converted to RangedInt!(0, 5) and have to convert it explicitly, or you get an implicit conversion with a runtime check that throws.

Just like with pointers, not knowing the actual control flow pushes range-constraint enforcement to runtime in situations like this one. It's better than nothing, since it'll throw immediately when an out-of-range value is passed to a function, so the wrong value won't propagate further, but static analysis would make this much better.

In fact, even the most obvious case can't be caught at compile time with the template approach:

    void baz()
    {
        foo(999); // runtime error here
    }

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca
Jan 18 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Michel Fortin:

 In fact, even the most obvious case can't be caught at 
 compile-time with the template approach:

 	void baz()
 	{
 		foo(999); // runtime error here
 	}
The constructor of the ranged int needs an "enum precondition": http://forum.dlang.org/thread/ksfwgjqewmsxsribenzq forum.dlang.org Bye, bearophile
Jan 18 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 2:16 PM, Michel Fortin wrote:
 It works, up to a point.

      void foo(RangedInt!(0, 5) a);

      void bar(RangedInt!(0, 10) a)
      {
          if (a < 5)
              foo(a); // what happens here?
      }

 In that "foo(a)" line, depending on the implementation of RangedInt you either
 get a compile-time error that RangedInt!(0, 10) can't be implicitly converted
to
 RangedInt!(0, 5) and have to explicitly convert it, or you get implicit
 conversion with a runtime check that throws.
Yes, and I'm not seeing the problem. (The runtime check may also be removed by the optimizer.)
 Just like pointers, not knowing about the actual control flow pushes range
 constrains enforcement at runtime in situations like this one.
With pointers, the enforcement only happens when converting a pointer to a nonnull pointer.
 In fact, even the most obvious case can't be caught at compile-time with the
 template approach:

      void baz()
      {
          foo(999); // runtime error here
      }
Sure it can. Inlining, etc., and appropriate use of compile time constraints.
Jan 18 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
The point being, there is a whole universe of subset types. You cannot begin to 
predict all the use cases, let alone come up with a special syntax for each.
Jan 18 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 The point being, there is a whole universe of subset types. You 
 cannot begin to predict all the use cases, let alone come up 
 with a special syntax for each.
On the other hand there are some groups of types that are very common, lead to a good percentage of bugs and troubles, and need refined type semantics & management to be handled well (so it's hard to implement them as library types). So there are some solutions:

1) Pick the most common types, like pointers/class references, integral values, and a few others, and hard-code their handling very well in the language.

2) Try to implement some barely working versions using the existing language features.

3) Add enough tools to the language to allow the creation of "good enough" library-defined features. (This is hard to do. Currently you can't implement "good enough" not-nullable reference types or ranged integers in D.)

4) Give up and accept using a simpler language, with a simpler compiler, that is easier to create and develop.

Bye,
bearophile
Jan 18 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 1) Pick the most common types, like pointers/class references, 
 integral values, and few others, and hard code their handling 
 very well in the language.
This is what Ada usually does.
 2) Try to implement some barely working versions using the 
 existing language features.
This is what D often has done.
 3) Add enough tools to the language to allow the creation of 
 "good enough" library defined features. (This is hard to do. 
 Currently you can't implement "good enough" not-nullable 
 reference types or ranged integers in D).
This is what some languages such as ATS try to do.
 4) Give up and accept to use a simpler language, with a simpler 
 compiler, that is more easy to create and develop.
This is what Go often does (in other cases it hard-codes a solution, like the built-in associative arrays). Bye, bearophile
Jan 18 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 2:43 PM, bearophile wrote:
 Walter Bright:

 The point being, there is a whole universe of subset types. You cannot begin
 to predict all the use cases, let alone come up with a special syntax for each.
On the other hand there are some groups of types that are very common, lead to a good percentage of bugs and troubles, and need refined type semantics & management to be handled well (so it's hard to implement them as library types).
That would be a problem with D's ability to define library types, and we should address that rather than take the Go approach and add more magic to the compiler.
 Currently you can't implement
 "good enough" not-nullable reference types or ranged integers in D).
This is not at all clear.
 4) Give up and accept to use a simpler language, with a simpler compiler, that
 is more easy to create and develop.
Applications are complicated. A simple language tends to push the complexity into the source code. Java is a fine example of that.
Jan 18 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 Currently you can't implement
 "good enough" not-nullable reference types or ranged integers 
 in D).
This is not at all clear.
A good Ranged should allow syntax like this, and it should catch this error at compile time (with an "enum precondition"):

    Ranged!(int, 0, 10)[] arr = [1, 5, 12, 3, 2];

It should also use the CPU overflow/carry flags to detect integer overflows efficiently enough on a Ranged!(uint, 0, uint.max) type. It should handle conversions to the supertype nicely and allow the use of a ranged int as an array index. And array bounds tests should be disabled if you are using a ranged size_t that is statically known to be within the interval of the array, because this is one of the main purposes of ranged integrals.

And D arrays should have optional strongly-typed index types, as in Ada, because this makes the code safer, easier to reason about, and even faster (thanks to disabling some now-unnecessary array bounds tests).

Similarly, not-nullable pointers and class references have some semantic requirements that are not easy to implement in D today.

Bye,
bearophile
Jan 18 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/18/2014 3:10 PM, bearophile wrote:
 Walter Bright:

 Currently you can't implement
 "good enough" not-nullable reference types or ranged integers in D).
This is not at all clear.
A good Ranged should allow syntax like this, and it should catch this error at compile time (with an "enum precondition"):

    Ranged!(int, 0, 10)[] arr = [1, 5, 12, 3, 2];

It should also use the CPU overflow/carry flags to detect integer overflows efficiently enough on a Ranged!(uint, 0, uint.max) type. It should handle conversions to the supertype nicely and allow the use of a ranged int as an array index. And array bounds tests should be disabled if you are using a ranged size_t that is statically known to be within the interval of the array, because this is one of the main purposes of ranged integrals. And D arrays should have optional strongly-typed index types, as in Ada, because this makes the code safer, easier to reason about, and even faster (thanks to disabling some now-unnecessary array bounds tests).
While these are all desirable features, it is not clear that these cannot be implemented without changing the language - for example, improved optimization can do a lot. And secondly, you said "good enough" not "perfect". Even putting things into the language does not imply they will be perfect, as experience amply shows.
 Similarly not-nullable pointers and class references have some semantic
 requirements that are not easy to implement in D today.
For example?
Jan 18 2014
prev sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-18 22:28:14 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 1/18/2014 2:16 PM, Michel Fortin wrote:
 It works, up to a point.
 
      void foo(RangedInt!(0, 5) a);
 
      void bar(RangedInt!(0, 10) a)
      {
          if (a < 5)
              foo(a); // what happens here?
      }
 
 In that "foo(a)" line, depending on the implementation of RangedInt you either
 get a compile-time error that RangedInt!(0, 10) can't be implicitly 
 converted to
 RangedInt!(0, 5) and have to explicitly convert it, or you get implicit
 conversion with a runtime check that throws.
Yes, and I'm not seeing the problem. (The runtime check may also be removed by the optimizer.)
I'm not concerned about performance, but about catching bugs early. You said it: the compiler (the optimizer) knows (more or less) whether it is possible for you to have an error here (which is why the check might disappear as dead code), but it will let it pass and make the generated code wait until a bad value is passed at runtime to throw something. Couldn't we tell the compiler to emit an error if a function argument "might" be in the offending range instead? That'd be much more useful to find bugs, because you'd find them at compile-time.
 Just like pointers, not knowing about the actual control flow pushes range
 constrains enforcement at runtime in situations like this one.
With pointers, the enforcement only happens when converting a pointer to a nonnull pointer.
True.
 In fact, even the most obvious case can't be caught at compile-time with the
 template approach:
 
      void baz()
      {
          foo(999); // runtime error here
      }
Sure it can. Inlining, etc., and appropriate use of compile time constraints.
Inlining will not allow the error to be caught at compile time, although it might allow the runtime check to be eliminated as dead code. But this is not about performance, it's about detecting this kind of bug early (at compile time). -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Jan 18 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 18 January 2014 at 21:05:02 UTC, Walter Bright wrote:
 On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions. Aside from that, non-null is only one case of a universe of types that are subsets of other types. I'd prefer a more general solution.
I was thinking about defining reference types (classes, interfaces) as non-nullable in @safe. It is still possible to encapsulate null in NonNull!T.
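One possible shape for such a library wrapper, sketched under the assumption that construction is the only way in (NonNull is not a Phobos type; a real implementation would need more care around opAssign and moves):

```d
// Hypothetical non-null wrapper: rejects null at construction and
// forbids default construction, so a null payload cannot occur.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this(); // no default construction

    this(T value)
    {
        assert(value !is null, "NonNull constructed from null");
        payload = value;
    }

    alias payload this; // forward member access to the wrapped reference
}

class Widget { int x; }

void use(NonNull!Widget w) { w.x = 1; } // no null check needed here
```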
Jan 18 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 18 January 2014 at 21:05:02 UTC, Walter Bright wrote:
 On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions. Aside from that, non-null is only one case of a universe of types that are subsets of other types. I'd prefer a more general solution.
I was thinking about defining reference types (classes, interfaces) as non-nullable in @safe. It is still possible to encapsulate null in Nullable!T.
Jan 18 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/18/2014 10:05 PM, Walter Bright wrote:
 On 1/17/2014 7:05 AM, Michel Fortin wrote:
 Some more thoughts.
The postfix ? has parsing issues with ?: expressions. ...
In what sense? It can be unambiguously parsed easily.
 Aside from that, non-null is only one case of a universe of types that
 are subsets of other types.
This is not true. The main rationale for "non-null" is to eliminate null dereferences. A? in his proposal is different from current nullable references in that the compiler does not allow it to be dereferenced. If we just had a solution for arbitrary subset types, we'd _still_ be left with a type of references that might be null but are not prevented from being dereferenced.

Besides, I think it is strange to think of valid references as just being some arbitrary special case of nullable references. A nullable reference is roughly what you get when you put an arbitrary incompatible value 'null' into the set of valid references:

    {valid references} ∪ {null}

All that is asked for is to make the disjunct that is _actually interesting_ its own type. Sure, it could be described as:

    {x ∈ ({valid references} ∪ {null}) | x ≠ null}

But I think this is a silly way of expressing oneself.
 I'd prefer a more general solution.
Subset types are not more general than the proposed feature.
Jan 18 2014
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-19 07:56:06 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 01/18/2014 10:05 PM, Walter Bright wrote:
 Aside from that, non-null is only one case of a universe of types that
 are subsets of other types.
This is not true. The main rationale for "non-null" is to eliminate null dereferences. A? in his proposal is different from current nullable references in that the compiler does not allow them to be dereferenced.
Actually, 'A?' would implicitly convert to 'A' where the compiler can prove control flow prevents its value from being null. So you can dereference it in a branch that checked for null:

    class A { int i; void foo(); }
    void bar(A a); // non-nullable parameter

    void test(A? a, A? a2)
    {
        a.i++;   // error, 'a' might be null
        a.foo(); // error, 'a' might be null
        bar(a);  // error, 'a' might be null

        if (a)
        {
            a.i++;   // valid, 'a' can't be null here
            a.foo(); // valid, 'a' can't be null here
            bar(a);  // valid, 'a' can't be null here
        }
    }

Obviously, the compiler has to be pessimistic, which means that if your control flow is too complicated you might have to use a cast, or add an extra "if" or assert. Personally, I don't see that as a problem. If I have to choose between dynamic typing and static typing, I'll choose the latter, even if sometimes the type system forces me to do a cast. Same thing here with null.
 If we just had a solution for arbitrary subset types, we'd _still_ be 
 left with a type of references that might be null, but are not 
 prevented to be dereferenced.
And I left out that point while converting the example to numeric ranges earlier. It's an important point. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Jan 19 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/19/2014 01:03 PM, Michel Fortin wrote:
 Actually, 'A?' would implicitly convert to 'A' where the compiler can
 prove control flow prevents its value from being null.
I think the type should be upgraded. i.e.:
 So you can
 dereference it in a branch that checked for null:

      class A { int i; void foo(); }
      void bar(A a); // non-nullable parameter

      void test(A? a, A? a2)
      {
          a.i++; // error, 'a' might be null
          a.foo(); // error, 'a' might be null
          bar(a); // error, 'a' might be null

          if (a)
          {
static assert(is(typeof(a)==A));
              a.i++; // valid [...]
              a.foo(); // valid [...]
              bar(a); // valid [...]
          }
      }
Jan 19 2014
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-01-19 20:07:40 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 01/19/2014 01:03 PM, Michel Fortin wrote:
 Actually, 'A?' would implicitly convert to 'A' where the compiler can
 prove control flow prevents its value from being null.
I think the type should be upgraded. i.e.:
 So you can
 dereference it in a branch that checked for null:
 
      class A { int i; void foo(); }
      void bar(A a); // non-nullable parameter
 
      void test(A? a, A? a2)
      {
          a.i++; // error, 'a' might be null
          a.foo(); // error, 'a' might be null
          bar(a); // error, 'a' might be null
 
          if (a)
          {
static assert(is(typeof(a)==A));
              a.i++; // valid [...]
              a.foo(); // valid [...]
              bar(a); // valid [...]
          }
      }
That's one way to do it. Note that this means you can't assign null to 'a' inside the 'if' branch. But I wouldn't worry too much about that. I think it'd make a good first implementation. What I expect from a not-null feature is that it starts by being over-restrictive and with time, as the control flow analysis evolves, unnecessary restrictions would be lifted. That's similar to how CTFE and purity became what they are today. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Jan 19 2014
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 20 January 2014 at 00:44:21 UTC, Michel Fortin wrote:
 On 2014-01-19 20:07:40 +0000, Timon Gehr <timon.gehr gmx.ch> 
 said:

 On 01/19/2014 01:03 PM, Michel Fortin wrote:
 Actually, 'A?' would implicitly convert to 'A' where the 
 compiler can
 prove control flow prevents its value from being null.
I think the type should be upgraded. i.e.:
 So you can
 dereference it in a branch that checked for null:
 
     class A { int i; void foo(); }
     void bar(A a); // non-nullable parameter
 
     void test(A? a, A? a2)
     {
         a.i++; // error, 'a' might be null
         a.foo(); // error, 'a' might be null
         bar(a); // error, 'a' might be null
 
         if (a)
         {
static assert(is(typeof(a)==A));
             a.i++; // valid [...]
             a.foo(); // valid [...]
             bar(a); // valid [...]
         }
     }
That's one way to do it. Note that this means you can't assign null to 'a' inside the 'if' branch. But I wouldn't worry too much about that. I think it'd make a good first implementation. What I expect from a not-null feature is that it starts by being over-restrictive and with time, as the control flow analysis evolves, unnecessary restrictions would be lifted. That's similar to how CTFE and purity became what they are today.
I don't see the point of introducing a new syntax for nullable when the D type system is already powerful enough to provide it as a library.
Jan 19 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/20/2014 01:44 AM, Michel Fortin wrote:
 That's one way to do it. Note that this means you can't assign null to
 'a' inside the 'if' branch.  ...
Such an assignment could downgrade the type again.

An alternative would be to not use flow analysis at all and require e.g.:

    A? a = foo();
    if (A b = a)
    {
        // use b of type 'A' here
    }

A solution that sometimes allows A? to be dereferenced will likely have issues with e.g. IFTI.
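For comparison, today's D already accepts this declare-and-test shape for plain (nullable) class references, which is the pattern the hypothetical `if (A b = a)` above generalizes (`maybeNull` is an invented example function):

```d
class A { int i; }

A maybeNull(bool flag)
{
    // illustrative source of a possibly-null reference
    return flag ? new A : null;
}

void demo()
{
    if (A b = maybeNull(true))
    {
        b.i++; // only reached when b is non-null
    }
}
```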
Jan 20 2014
prev sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 17 Jan 2014 01:45, "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org>
wrote:
 Walter and I were talking today about the null pointer issue and he had
the following idea.
 One common idiom to replace null pointer exceptions with milder
reproducible errors is the null object pattern, i.e. there is one object that is used in lieu of the null reference to initialize all otherwise uninitialized references. In D that would translate naturally to:
 class Widget
 {
     private int x;
     private Widget parent;
     this(int y) { x = y; }
     ...
     // Here's the interesting part
     static Widget init = new Widget(42);
 }

 Currently the last line doesn't compile, but we can make it work if the
respective constructor is callable during compilation. The compiler will allocate a static buffer for the "new"ed Widget object and will make init point there.
 Whenever a Widget is to be default-initialized, it will point to
Widget.init (i.e. it won't be null). This beautifully extends the language because currently (with no init definition) Widget.init is null.
 So the init Widget will satisfy:

 assert(x == 42 && parent is Widget.init);

 Further avenues are opened by thinking what happens if e.g. init is
private or disable-d.
 Thoughts?
I would have thought that sort of code would be doable, since someone thought it would be a good idea to introduce ClassReferenceExp to the frontend implementation. :)
Jan 17 2014