
digitalmars.D - Disallow null references in safe code?

reply "Xinok" <xinok live.com> writes:
I don't know where the community currently stands on non-nullable 
types in D, so this idea may be based on a bit of ignorance. 
Assuming there are some technical issues preventing non-nullable 
types from being implemented, I had a different idea that may be 
somewhat of a compromise. As you've gathered by now, it's simply 
to disallow nullifying references in safe code.

The idea is simply that safe functions can only call other safe 
functions, so null references should be practically non-existent 
... except that's an ideal which can't be reached with this 
restriction alone. There are two obvious issues:

* There's no way to guarantee input is free of null references
* Trusted functions may return objects with null references; it's 
currently not convention to avoid null references in trusted code

Even so, I think such a restriction could be helpful in 
preventing bugs/crashes and writing correct code, at least until 
we can get non-nullable types.
Jan 31 2014
next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Saturday, 1 February 2014 at 01:14:07 UTC, Xinok wrote:
 I don't know where the community currently stands on 
 non-nullable types in D, so this idea may be based on a bit of 
 ignorance.

I've written a couple of candidates:

http://arsdnet.net/dcode/notnull.d
http://arsdnet.net/dcode/notnullsimplified.d

The second one is a much simpler implementation that I think covers the same bases. Gotta try it all in real world code though.
Jan 31 2014
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 01:14:05 Xinok wrote:
 I don't know where the community currently stands on non-nullable
 types in D, so this idea may be based on a bit of ignorance.
 Assuming there are some technical issues preventing non-nullable
 types from being implemented, I had a different idea that may be
 somewhat of a compromise. As you've gathered by now, it's simply
 to disallow nullifying references in safe code.
 
 The idea is simply that safe functions can only call other safe
 functions, so null references should be practically non-existent
 ... except that's an ideal which can't be reached with this
 restriction alone. There are two obvious issues:
 
 * There's no way to guarantee input is free of null references
 * Trusted functions may return objects with null references; it's
 currently not convention to avoid null references in trusted code
 
 Albeit that, I think such a restriction could be helpful in
 preventing bugs/crashes and writing correct code, at least until
 we can get non-nullable types.

There's nothing unsafe about null pointers/references. safe is about memory safety, and you can't corrupt memory or otherwise access memory that you're not supposed to with a null pointer or reference.

At some point here, we'll have NonNullable (or NotNull, or whatever it ends up being called) in Phobos so that folks can have non-nullable references/pointers - e.g. NonNullable!Foo. AFAIK, the only real hold-up is someone completing a fully functional implementation. There's been at least one attempt at it, but as I understand it, there were issues that needed to be worked through before it could be accepted. We'll get there though.

Regardless, we're not adding anything with regards to non-nullable references to the language itself, and there's nothing unsafe about null references. They're just unpleasant to dereference when your code makes that mistake.

- Jonathan M Davis
Jan 31 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/14, 5:39 PM, Jonathan M Davis wrote:
 Regardless, we're not adding anything with regards to non-nullable references
 to the language itself [...]

I think the sea is changing here. I've collected hard data that reveals null pointer dereference is a top problem for at least certain important categories of applications. Upon discussing that data with Walter, it became clear that no amount of belief to the contrary, personal anecdote, or rhetoric can stand against the data.

It also became clear that a library solution would improve things but cannot compete with a language solution. The latter can do local flow-sensitive inference and require notations only for function signatures. Consider:

class Widget { ... }

void fun() {
    // assume fetchWidget() may return null
    Widget w = fetchWidget();
    if (w) {
        ... here w is automatically inferred as non-null ...
    }
}

Bottom line: a language change for non-null references is on the table. It will be an important focus of 2014.

Andrei
Feb 01 2014
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/01/2014 11:05 PM, Adam D. Ruppe wrote:
 A library solution to this exists already:

 Widget wn = fetchWidget();
 if(auto w = wn.checkNull) {
     // w implicitly converts to NotNull!Widget
 }

 I've had some difficulty in const correctness with my implementation...
 but const correctness is an entirely separate issue anyway.

 It isn't quite the same as if(w) but meh, does that matter? The point of
 the static check is to make you think about it,  and that's achieved here.

The following illustrates what's not achieved here:

if(auto w = wn.checkNull){
    // ...
}else w.foo();
Feb 01 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/02/2014 04:39 AM, Jonathan M Davis wrote:
 I'm not sure how I feel about that, particularly since I haven't seen such
 data myself. My natural reaction when people complain about null pointer
 problems is that they're sloppy programmers (which isn't necessarily fair, but
 that's my natural reaction).

There is no such thing as 'naturality' that is magically able to justify personal attacks in a technical discussion, even if qualified.
 I pretty much never have problems with null
 pointers and have never understood why so many people complain about them.

I guess it is usually some combination of:

* Some projects feature more than one programmer.
* Hard data.
* Aesthetics.

(The situation does not have to be unbearable in order to improve it!)
Feb 02 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/02/2014 01:42 PM, Jonathan M Davis wrote:
 On Sunday, February 02, 2014 12:52:44 Timon Gehr wrote:
 On 02/02/2014 04:39 AM, Jonathan M Davis wrote:
 I'm not sure how I feel about that, particularly since I haven't seen such
 data myself. My natural reaction when people complain about null pointer
 problems is that they're sloppy programmers (which isn't necessarily fair,
 but that's my natural reaction).

There is no such thing as 'naturality' that is magically able to justify personal attacks in a technical discussion, even if qualified.

Would you prefer that I had said "initial reaction" or "gut reaction?"

That's completely missing the point. "No such thing as".
 I'm just saying that that's how I tend to feel when I see complaints about null
 pointers.

Sure. Assuming a basic amount of honesty, people _always_ post their own opinions. To say it as clearly as I can: Please don't feel that way. It is completely unjustified.
 I have never accused anyone of anything  or otherwise attacked them
 because they complained about null pointers.

I have wasted some time trying to figure out the basis of this claim.
 That _would_ be rude.
 ...

We agree on this point.
Feb 02 2014
prev sibling parent reply Nick Treleaven <ntrel-public yahoo.co.uk> writes:
On 01/02/2014 22:05, Adam D. Ruppe wrote:
 On Saturday, 1 February 2014 at 18:58:11 UTC, Andrei Alexandrescu wrote:
     Widget w = fetchWidget();
     if (w)
     {
         ... here w is automatically inferred as non-null ...
     }

A library solution to this exists already:

Widget wn = fetchWidget();
if(auto w = wn.checkNull) {
    // w implicitly converts to NotNull!Widget
}

I read your recent post about this, it was interesting. But I don't think you can disallow this:

auto cn = checkNull(cast(C)null);
NotNull!C nn = cn;

obj2 is then null, when it shouldn't be allowed.
Feb 02 2014
parent Nick Treleaven <ntrel-public yahoo.co.uk> writes:
On 02/02/2014 13:18, Nick Treleaven wrote:
      auto cn = checkNull(cast(C)null);
      NotNull!C nn = cn;

 obj2 is then null, when it shouldn't be allowed.

Oops, I meant nn, not obj2.
Feb 02 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 2:14 AM, Jonathan M Davis wrote:
 On Saturday, February 01, 2014 04:01:50 deadalnix wrote:
 Dereferencing it is unsafe unless you put runtime check.

How is it unsafe? It will segfault and kill your program, not corrupt memory. It can't even read any memory. It's a bug to dereference a null pointer or reference, but it's not unsafe, because it can't access _any_ memory, let alone memory that it's not supposed to be accessing, which is precisely what safe is all about.

This has been discussed to death a number of times. A field access obj.field will use addressing with a constant offset. If that offset is larger than the lowest address allowed to the application, unsafety may occur. The amount of low-address memory protected is OS-dependent. 4KB can virtually always be counted on. For fields placed beyond that limit, a runtime test must be inserted. There are few enough 4KB objects out there to make this practically a non-issue. But the checks must be there.
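The offset arithmetic described above can be sketched in C (chosen here because the surrounding optimizer discussion is about C semantics); the struct layout below is a hypothetical illustration, not code from the thread:

```c
#include <stddef.h>

/* Hypothetical struct whose later fields sit past the 4KB guard
 * page: with a null base pointer, accessing `field` computes
 * address 0 + 8192, which may land in mapped memory instead of
 * the protected low pages -- hence the need for a runtime check. */
struct Huge {
    char pad[8192];  /* pushes the next field past the 4KB page */
    int field;
};

/* The constant offset the compiler uses for obj->field. */
size_t field_offset(void) {
    return offsetof(struct Huge, field);
}
```

A null dereference of a small object faults inside the protected low pages; only objects like this one, with fields past the guarded region, need the inserted check.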
   Which is stupid for something that can be verified at compile time.

In the general case, you can only catch it at compile time if you disallow it completely, which is unnecessarily restrictive. Sure, some basic cases can be caught, but unless the code where the pointer/reference is defined is right next to the code where it's dereferenced, there's no way for the compiler to have any clue whether it's null or not. And yes, there's certainly code where it would make sense to use non-nullable references or pointers, because there's no need for them to be nullable, and having them be non-nullable avoids any risk of forgetting to initialize them, but that doesn't mean that nullable pointers and references aren't useful or that you can catch all instances of a null pointer or reference being dereferenced at compile time.

The Java community has a good experience with @Nullable:

http://stackoverflow.com/questions/14076296/nullable-annotation-usage

Andrei
Feb 01 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/1/2014 12:09 PM, Andrei Alexandrescu wrote:
 On 2/1/14, 2:14 AM, Jonathan M Davis wrote:
 How is it unsafe? It will segfault and kill your program, not corrupt memory.
 It can't even read any memory. It's a bug to dereference a null pointer or
 reference, but it's not unsafe, because it can't access _any_ memory, let
 alone memory that it's not supposed to be accessing, which is precisely what
  safe is all about.

This has been discussed to death a number of times. A field access obj.field will use addressing with a constant offset. If that offset is larger than the lowest address allowed to the application, unsafety may occur. The amount of low-address memory protected is OS-dependent. 4KB can virtually always be counted on. For fields placed beyond that limit, a runtime test must be inserted. There are few enough 4KB objects out there to make this practically a non-issue. But the checks must be there.

Another way to deal with it is to simply disallow safe objects that are larger than 4K (or whatever the size is on the target system).
Feb 01 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 1:11 PM, Walter Bright wrote:
 On 2/1/2014 12:09 PM, Andrei Alexandrescu wrote:
 On 2/1/14, 2:14 AM, Jonathan M Davis wrote:
 How is it unsafe? It will segfault and kill your program, not corrupt
 memory.
 It can't even read any memory. It's a bug to dereference a null
 pointer or
 reference, but it's not unsafe, because it can't access _any_ memory,
 let
 alone memory that it's not supposed to be accessing, which is
 precisely what
  safe is all about.

This has been discussed to death a number of times. A field access obj.field will use addressing with a constant offset. If that offset is larger than the lowest address allowed to the application, unsafety may occur. The amount of low-address memory protected is OS-dependent. 4KB can virtually always be counted on. For fields placed beyond that limit, a runtime test must be inserted. There are few enough 4KB objects out there to make this practically a non-issue. But the checks must be there.

Another way to deal with it is to simply disallow safe objects that are larger than 4K (or whatever the size is on the target system).

This seems like an arbitrary limitation, and one that's hard to work around without significant code surgery. I think the true solution is runtime checks for fields beyond the 4K barrier. They will be few and far between and performance-conscious coders already know well to lay out all hot data at the beginning of the object. Andrei
Feb 01 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 1:40 PM, deadalnix wrote:
 On Saturday, 1 February 2014 at 20:09:13 UTC, Andrei Alexandrescu wrote:
 This has been discussed to death a number of times. A field access
 obj.field will use addressing with a constant offset. If that offset
 is larger than the lowest address allowed to the application, unsafety
 may occur.

That is one point. The other point is that the optimizer can remove a null check, and then a load, causing undefined behavior.

I don't understand this. Program crash is defined behavior. Andrei
Feb 01 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 5:17 PM, deadalnix wrote:
 On Sunday, 2 February 2014 at 01:01:25 UTC, Andrei Alexandrescu wrote:
 On 2/1/14, 1:40 PM, deadalnix wrote:
 On Saturday, 1 February 2014 at 20:09:13 UTC, Andrei Alexandrescu wrote:
 This has been discussed to death a number of times. A field access
 obj.field will use addressing with a constant offset. If that offset
 is larger than the lowest address allowed to the application, unsafety
 may occur.

That is one point. The other point is that the optimizer can remove a null check, and then a load, causing undefined behavior.

I don't understand this. Program crash is defined behavior. Andrei

This has also been discussed. Let's consider the buggy code below:

void foo(int* ptr) {
    *ptr;
    if (ptr !is null) {
        // do stuff
    }

    // do other stuff
}

Note that the code presented above looks quite stupid, but this is typically what you end up with if you call 2 functions, one that does a null check and one that doesn't, after inlining.

You would expect that the program segfaults at the first line. But it is in fact undefined behavior. The optimizer can decide to remove the null check, as ptr is dereferenced before and so can't be null, and a later pass can remove the first dereference as it is a dead load. Both GCC and LLVM optimizers can exhibit such behavior.
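For illustration, here is a hypothetical C rendition of the pre-inlining shape described above; the helper names are invented, and the point is only the structure the optimizer sees after inlining:

```c
#include <stddef.h>

/* One helper dereferences without a check, the other checks.
 * After inlining both into foo, an optimizer may reason "ptr was
 * already dereferenced, so it is not null", delete the check, and
 * then delete the first dereference as a dead load -- the
 * undefined-behavior scenario described in the post. */
static int use_unchecked(int *ptr) {
    return *ptr;             /* no null check */
}

static int use_checked(int *ptr) {
    if (ptr != NULL)         /* check the optimizer may fold away */
        return *ptr + 1;
    return 0;
}

int foo(int *ptr) {
    (void)use_unchecked(ptr);  /* result unused: a dead load */
    return use_checked(ptr);
}
```

With a valid pointer the code behaves normally; with a null pointer, whether it traps at all depends on which of the two deletions the optimizer performs.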

Do you have any pointers to substantiate that? I find such a behavior rather bizarre. Andrei
Feb 01 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen. Andrei
Feb 01 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/2/2014 1:19 PM, deadalnix wrote:
 But in
 our case, it implies that the optimizer won't be able to optimize away loads
 that it can't prove won't trap. That means the compiler won't be able to
 optimize most loads.

I do understand that issue, but I'm not sure what the solution is.
Feb 02 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/2/14, 1:23 PM, deadalnix wrote:
 On Sunday, 2 February 2014 at 10:58:51 UTC, Dicebot wrote:
 On Sunday, 2 February 2014 at 03:45:06 UTC, Andrei Alexandrescu wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen. Andrei

As far as I have understood previous posts, it is even worse than that - the LLVM optimiser assumes C semantics whatever the high-level language is. deadalnix, is that true?

It depends. For instance you can specify wrap-around semantics, so both undefined and defined overflow exist. In the precise case we are talking about, it really does not make any sense to propose any other semantics, as it would prevent the optimizer from optimizing away most loads.

A front-end pass could replace the dead dereference with a guard that asserts the reference is not null. More generally this is a matter that can be fixed but currently is not receiving attention by backend writers. Andrei
Feb 02 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02/03/2014 12:09 AM, Andrei Alexandrescu wrote:
 A front-end pass could replace the dead dereference with a guard that
 asserts the reference is not null.

I don't think this would be feasible. (The front-end pass would need to simulate all back-end passes in order to find all the references that might be proven dead.)
 More generally this is a matter that
 can be fixed but currently is not receiving attention by backend writers.

 Andrei

Yup.
Feb 02 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/2/14, 3:44 PM, Timon Gehr wrote:
 On 02/03/2014 12:09 AM, Andrei Alexandrescu wrote:
 A front-end pass could replace the dead dereference with a guard that
 asserts the reference is not null.

I don't think this would be feasible. (The front-end pass would need to simulate all back-end passes in order to find all the references that might be proven dead.)

Well I was thinking of a backend-specific assertion directive. Worst case, the front end could assign to a volatile global:

__vglobal = *p;

Andrei
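A minimal C sketch of the volatile-sink idea (the name __vglobal follows the post; the helper function is hypothetical):

```c
/* Storing the loaded value into a volatile global makes the load
 * observable, so the optimizer cannot delete it as dead; a null
 * pointer still traps at the load itself. */
volatile int __vglobal;

void keep_load_alive(int *p) {
    __vglobal = *p;  /* observable store: load cannot be removed */
}
```

The cost is an extra store per guarded dereference, which is why a backend-specific assertion directive would be the cleaner mechanism.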
Feb 02 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/2/14, 5:47 PM, deadalnix wrote:
 On Sunday, 2 February 2014 at 23:55:48 UTC, Andrei Alexandrescu wrote:
 On 2/2/14, 3:44 PM, Timon Gehr wrote:
 On 02/03/2014 12:09 AM, Andrei Alexandrescu wrote:
 A front-end pass could replace the dead dereference with a guard that
 asserts the reference is not null.

I don't think this would be feasible. (The front-end pass would need to simulate all back-end passes in order to find all the references that might be proven dead.)

Well I was thinking of a backend-specific assertion directive. Worst case, the front end could assign to a volatile global: __vglobal = *p; Andrei

As far as the backend is concerned, dereferencing and assigning to a volatile are 2 distinct operations.

No matter. The point is the dereference will not be dead to the optimizer. Andrei
Feb 02 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/3/2014 12:00 AM, Uranuz wrote:
 At the current state OS send SEGFAULT message and I can't
 even get some info where is the source of problem.

1. Compile with symbolic debug info on (-g switch)
2. Run under a debugger, such as gdb
3. When it seg faults, type:

bt

and it will give you a "backtrace", i.e. where it faulted, and the functions that are on the stack.
Feb 03 2014
next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 2/3/14, 5:56 AM, Walter Bright wrote:
 On 2/3/2014 12:00 AM, Uranuz wrote:
 At the current state OS send SEGFAULT message and I can't
 even get some info where is the source of problem.

1. Compile with symbolic debug info on (-g switch)
2. Run under a debugger, such as gdb
3. When it seg faults, type:

bt

and it will give you a "backtrace", i.e. where it faulted, and the functions that are on the stack.

You keep missing the point that when the segfault happens you might have no idea how to reproduce it. You don't even know where it happens. So it's not that easy to reach point 3. If a null access raises an exception, you immediately get the backtrace and that can help you understand why it happened.
Feb 03 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/3/2014 12:46 PM, Ary Borenszweig wrote:
 On 2/3/14, 5:56 AM, Walter Bright wrote:
 On 2/3/2014 12:00 AM, Uranuz wrote:
 At the current state OS send SEGFAULT message and I can't
 even get some info where is the source of problem.

1. Compile with symbolic debug info on (-g switch)
2. Run under a debugger, such as gdb
3. When it seg faults, type:

bt

and it will give you a "backtrace", i.e. where it faulted, and the functions that are on the stack.

You keep missing the point that when the segfault happens you might have no idea how to reproduce it. You don't even know where it happens. So it's not that easy to reach point 3.

The first step is ensuring that people know how to use a debugger. The second step, for the case you mentioned, is figuring out how to attach a debugger to a crashed process.
 If a null access raises an exception, you immediately get the backtrace and
that
 can help you understand why it happened.

I agree that's certainly more convenient, but I was addressing the "I can't" issue. The message did not have enough information to determine if he was having trouble with a basic issue or a more advanced case.
Feb 03 2014
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2014-02-03 11:41, Jonathan M Davis wrote:

 I recall there being a change to druntime such that it would print a backtrace
 for you if you hit a segfault, but it doesn't seem to work when I write a
 quick test program which segfaults, so I'm not sure what happened to that.

1. It only works on Linux
2. You need to call a function in a module constructor (I don't recall which one)

-- 
/Jacob Carlborg
Feb 05 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/1/14, 7:29 PM, Jonathan M Davis wrote:
 On Saturday, February 01, 2014 12:09:10 Andrei Alexandrescu wrote:
 On 2/1/14, 2:14 AM, Jonathan M Davis wrote:
 On Saturday, February 01, 2014 04:01:50 deadalnix wrote:
 Dereferencing it is unsafe unless you put runtime check.

How is it unsafe? It will segfault and kill your program, not corrupt memory. It can't even read any memory. It's a bug to dereference a null pointer or reference, but it's not unsafe, because it can't access _any_ memory, let alone memory that it's not supposed to be accessing, which is precisely what safe is all about.

This has been discussed to death a number of times. A field access obj.field will use addressing with a constant offset. If that offset is larger than the lowest address allowed to the application, unsafety may occur. The amount of low-address memory protected is OS-dependent. 4KB can virtually always be counted on. For fields placed beyond that limit, a runtime test must be inserted. There are few enough 4KB objects out there to make this practically a non-issue. But the checks must be there.

Hmmm. I forgot about that. So, in essence, dereferencing null pointers is almost always perfectly safe, but in rare, corner cases can be unsafe. At that point, we could either always insert runtime checks for pointers to such large types, or we could mark all pointers to such types as @system (that's not even vaguely acceptable in the general case, but it might be acceptable in a rare corner case like this). Or we could just disallow such types entirely, though it wouldn't surprise me if someone screamed over that. Runtime checks are probably the best solution, though with any of those solutions, I'd be a bit worried about there being bugs with the implementation, since we then end up with a rare, special case which is not well tested in real environments.
    Which is stupid for something that can be verified at compile time.

In the general case, you can only catch it at compile time if you disallow it completely, which is unnecessarily restrictive. Sure, some basic cases can be caught, but unless the code where the pointer/reference is defined is right next to the code where it's dereferenced, there's no way for the compiler to have any clue whether it's null or not. And yes, there's certainly code where it would make sense to use non-nullable references or pointers, because there's no need for them to be nullable, and having them be non-nullable avoids any risk of forgetting to initialize them, but that doesn't mean that nullable pointers and references aren't useful or that you can catch all instances of a null pointer or reference being dereferenced at compile time.

http://stackoverflow.com/questions/14076296/nullable-annotation-usage

Sure, and there are other things that the compiler can do to catch null dereferences (e.g. look at the first dereferencing of the pointer in the function that it's declared in and make sure that it was initialized or assigned a non-null value first), but the only way to catch all null dereferences at compile time would be to always know at compile time whether the pointer was null at the point that it's dereferenced, and that can't be done.

What are you talking about? That has been done.
 AFAIK, the only solution that guarantees that it catches all dereferences of
 null at compile time is a solution that disallows a pointer/reference from
 ever being null in the first place.

Have you read through that link? Andrei
Feb 01 2014
prev sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
On 2/1/14, 7:14 AM, Jonathan M Davis wrote:
 In the general case, you can only catch it at compile time if you disallow it
 completely, which is unnecessarily restrictive. Sure, some basic cases can be
 caught, but unless the code where the pointer/reference is defined is right
 next to the code where it's dereferenced, there's no way for the compiler to
 have any clue whether it's null or not.

This is not true. It's possible to do this, at least for the case where you dereference a variable or an object's field. See this: http://crystal-lang.org/2013/07/13/null-pointer-exception.html
Feb 03 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 1 February 2014 at 01:39:46 UTC, Jonathan M Davis 
wrote:
 On Saturday, February 01, 2014 01:14:05 Xinok wrote:
 I don't know where the community currently stands on 
 non-nullable
 types in D, so this idea may be based on a bit of ignorance.
 Assuming there are some technical issues preventing 
 non-nullable
 types from being implemented, I had a different idea that may 
 be
 somewhat of a compromise. As you've gathered by now, it's 
 simply
 to disallow nullifying references in safe code.
 
 The idea is simply that safe functions can only call other safe
 functions, so null references should be practically 
 non-existent
 ... except that's an ideal which can't be reached with this
 restriction alone. There are two obvious issues:
 
 * There's no way to guarantee input is free of null references
 * Trusted functions may return objects with null references; 
 it's
 currently not convention to avoid null references in trusted 
 code
 
 Albeit that, I think such a restriction could be helpful in
 preventing bugs/crashes and writing correct code, at least 
 until
 we can get non-nullable types.

There's nothing unsafe about null pointers/references. safe is about memory safety, and you can't corrupt memory or otherwise access memory that you're not supposed to with a null pointer or reference. At some point here, we'll have NonNullable (or NotNull, or whatever it ends up being called) in Phobos so that folks can have non-nullable references/pointers - e.g. NonNullable!Foo. AFAIK, the only real hold-up is someone completing a fully functional implementation. There's been at least one attempt at it, but as I understand it, there were issues that needed to be worked through before it could be accepted. We'll get there though. Regardless, we're not adding anything with regards to non-nullable references to the language itself, and there's nothing unsafe about null references. They're just unpleasant to dereference when your code makes that mistake. - Jonathan M Davis

Dereferencing it is unsafe unless you put in a runtime check. Which is stupid for something that can be verified at compile time.
Jan 31 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/31/14, 5:14 PM, Xinok wrote:
 I don't know where the community currently stands on non-nullable types
 in D, so this idea may be based on a bit of ignorance. Assuming there
 are some technical issues preventing non-nullable types from being
 implemented, I had a different idea that may be somewhat of a
 compromise. As you've gathered by now, it's simply to disallow
 nullifying references in safe code.

 The idea is simply that safe functions can only call other safe
 functions, so null references should be practically non-existent ...
 except that's an ideal which can't be reached with this restriction
 alone. There are two obvious issues:

 * There's no way to guarantee input is free of null references
 * Trusted functions may return objects with null references; it's
 currently not convention to avoid null references in trusted code

 Albeit that, I think such a restriction could be helpful in preventing
 bugs/crashes and writing correct code, at least until we can get
 non-nullable types.

It's an interesting idea, but I don't think it would work well for the reasons others mentioned. Andrei
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 04:01:50 deadalnix wrote:
 There's nothing unsafe about null pointers/references.  safe is
 about memory
 safety, and you can't corrupt memory and otherwise access
 memory that you're
 not supposed to with a null pointer or reference.
 
 At some point here, we'll have NonNullable (or NotNull whatever
 it ends up
 being called) in Phobos so that folks can have non-nullable
 references/pointers - e.g. NonNullable!Foo. AFAIK, the only
 real hold-up is
 someone completely a fully functional implementation. There's
 been at least
 one attempt at it, but as I understand it, there were issues
 that needed to be
 worked through before it could be accepted. We'll get there
 though.
 
 Regardless, we're not adding anything with regards to
 non-nullable references
 to the language itself, and there's nothing unsafe about null
 references.
 They're just unpleasant to dereference when your code makes
 that mistake.
 
 - Jonathan M Davis

Dereferencing it is unsafe unless you put runtime check.

How is it unsafe? It will segfault and kill your program, not corrupt memory. It can't even read any memory. It's a bug to dereference a null pointer or reference, but it's not unsafe, because it can't access _any_ memory, let alone memory that it's not supposed to be accessing, which is precisely what safe is all about.
  Which is stupid for something that can be verified at compile time.

In the general case, you can only catch it at compile time if you disallow it completely, which is unnecessarily restrictive. Sure, some basic cases can be caught, but unless the code where the pointer/reference is defined is right next to the code where it's dereferenced, there's no way for the compiler to have any clue whether it's null or not. And yes, there's certainly code where it would make sense to use non-nullable references or pointers, because there's no need for them to be nullable, and having them be non-nullable avoids any risk of forgetting to initialize them, but that doesn't mean that nullable pointers and references aren't useful or that you can catch all instances of a null pointer or reference being dereferenced at compile time. - Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 1 February 2014 at 20:09:13 UTC, Andrei Alexandrescu 
wrote:
 This has been discussed to death a number of times. A field 
 access obj.field will use addressing with a constant offset. If 
 that offset is larger than the lowest address allowed to the 
 application, unsafety may occur.

That is one point. The other point is that the optimizer can remove a null check, and then a load, causing undefined behavior. The solution to that is to prevent the optimizer from removing any load unless it can prove it has no side effect (cannot trap), which is certainly something we don't want to do (for manpower reasons, we probably don't want to ditch existing optimizers, as well as for the performance hit that this implies).
Feb 01 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Saturday, 1 February 2014 at 18:58:11 UTC, Andrei Alexandrescu 
wrote:
     Widget w = fetchWidget();
     if (w)
     {
         ... here w is automatically inferred as non-null ...
     }

A library solution to this exists already:

    Widget wn = fetchWidget();
    if(auto w = wn.checkNull) {
        // w implicitly converts to NotNull!Widget
    }

I've had some difficulty with const correctness in my implementation... but const correctness is an entirely separate issue anyway.

It isn't quite the same as if(w), but meh, does that matter? The point of the static check is to make you think about it, and that's achieved here.

If we do want to get the if(w) to work, I'd really prefer to do that as a library solution too, since then we might be able to use it elsewhere as well. Maybe some kind of template that lets you do a scoped transformation of the type, idk really.
Feb 01 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Saturday, 1 February 2014 at 18:58:11 UTC, Andrei Alexandrescu 
wrote:
 On 1/31/14, 5:39 PM, Jonathan M Davis wrote:
 Regardless, we're not adding anything with regards to 
 non-nullable references
 to the language itself [...]

I think the sea is changing here. I've collected hard data that reveals null pointer dereference is a top problem for at least certain important categories of applications. Upon discussing that data with Walter, it became clear that no amount of belief to the contrary, personal anecdote, or rhetoric can stand against the data.

It also became clear that a library solution would improve things but cannot compete with a language solution. The latter can do local flow-sensitive inference and require notations only for function signatures. Consider:

    class Widget { ... }

    void fun()
    {
        // assume fetchWidget() may return null
        Widget w = fetchWidget();
        if (w)
        {
            ... here w is automatically inferred as non-null ...
        }
    }

Bottom line: a language change for non-null references is on the table. It will be an important focus of 2014.


Andrei

That is excellent news.
Feb 01 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Saturday, 1 February 2014 at 20:03:40 UTC, Jonathan M Davis 
wrote:
 In the general case, you can only catch it at compile time if 
 you disallow it
 completely, which is unnecessarily restrictive.

That is not accurate. The proposal here proposes to make it @system instead of disallowing it completely. Even looser, I propose to make it @system to pass references that can be null through an interface (function calls/returns mostly). So you can use null locally, where the compiler can check that you do not dereference it, and ensure that data coming from somewhere else is not null, unless specified as such.
 Sure, some basic cases can be
 caught, but unless the code where the pointer/reference is 
 defined is right
 next to the code where it's dereferenced, there's no way for 
 the compiler to
 have any clue whether it's null or not. And yes, there's 
 certainly code where
 it would make sense to use non-nullable references or pointers, 
 because
 there's no need for them to be nullable, and having them be 
 non-nullable
 avoids any risk of forgetting to initialize them, but that 
 doesn't mean that
 nullable pointers and references aren't useful or that you can 
 catch all
 instances of a null pointer or reference being dereferenced at 
 compile time.

 - Jonathan M Davis

Feb 01 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 01:01:25 UTC, Andrei Alexandrescu 
wrote:
 On 2/1/14, 1:40 PM, deadalnix wrote:
 On Saturday, 1 February 2014 at 20:09:13 UTC, Andrei 
 Alexandrescu wrote:
 This has been discussed to death a number of times. A field 
 access
 obj.field will use addressing with a constant offset. If that 
 offset
 is larger than the lowest address allowed to the application, 
 unsafety
 may occur.

That is one point. The other point is that the optimizer can remove a null check, and then a load, causing undefined behavior.

I don't understand this. Program crash is defined behavior. Andrei

This has also been discussed. Let's consider the buggy code below:

    void foo(int* ptr) {
        *ptr;
        if (ptr !is null) {
            // do stuff
        }

        // do other stuff
    }

Note that the code presented above looks quite stupid, but this is typically what you end up with if you call 2 functions, one that does a null check and one that doesn't, after inlining.

You would expect the program to segfault at the first line. But it is in fact undefined behavior. The optimizer can decide to remove the null check, as ptr is dereferenced before it and so can't be null, and a later pass can remove the first dereference, as it is a dead load. Both the GCC and LLVM optimizers can exhibit such behavior.

Dereferencing null is not guaranteed to segfault unless we impose restrictions on the optimizer, such as: do not optimize a load away unless you can prove it won't trap, which is almost impossible for the compiler to know. As a result, you won't be able to optimize most loads away.

Unless we are willing to impose such restrictions on the optimizer (understand: recode several passes of existing optimizers, or do not rely on them, which is a huge manpower cost, and accept poorer performance), dereferencing null is undefined behavior and can't be guaranteed to crash.
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 02, 2014 00:40:16 deadalnix wrote:
 On Saturday, 1 February 2014 at 20:03:40 UTC, Jonathan M Davis
 
 wrote:
 In the general case, you can only catch it at compile time if
 you disallow it
 completely, which is unnecessarily restrictive.

That is not accurate. The proposal here proposes to make it @system instead of disallowing it completely. Even looser, I propose to make it @system to pass references that can be null through an interface (function calls/returns mostly). So you can use null locally, where the compiler can check that you do not dereference it, and ensure that data coming from somewhere else is not null, unless specified as such.

Yes, and making pointers @system is too restrictive. For instance, AAs _rely_ on the ability to have nullable pointers. That's how their in operator works. The same is likely to go for any in operator that's looking to be efficient. Do you propose that the in operator be @system? It's too restrictive for pointers or references to be @system, and making them @system under any kind of normal circumstances would be a huge blow to @safe.

We protect @safe code from problems with pointers and references by disallowing unsafe values from being assigned to them and by disallowing situations which can result in their valid, safe values becoming invalid, and thus unsafe. Passing a pointer to a function should not suddenly make it @system. And on top of the considerations for what @safe is supposed to be and do, it would be a real pain if an @safe function which initialized a pointer with a valid value then ended up with the function it passed that pointer to being unsafe just because that function doesn't know whether it was given a valid pointer or not, because @safe functions can't call @system functions. If we want to make pointers and references safe, it needs to be when their values are set, and that needs to include null as an @safe value.

Andrei raised _one_ case where it's possible for a null pointer to access memory that it shouldn't - where the object is over 4K (which is ridiculously large). If that can cause problems, then maybe having a pointer to an object like that could be considered @system, but having int* suddenly be @system because of potential null pointers pretty much completely shoots @safe in the foot with regards to pointers.

- Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 12:09:10 Andrei Alexandrescu wrote:
 On 2/1/14, 2:14 AM, Jonathan M Davis wrote:
 On Saturday, February 01, 2014 04:01:50 deadalnix wrote:
Dereferencing it is unsafe unless you put in a runtime check.

How is it unsafe? It will segfault and kill your program, not corrupt memory. It can't even read any memory. It's a bug to dereference a null pointer or reference, but it's not unsafe, because it can't access _any_ memory, let alone memory that it's not supposed to be accessing, which is precisely what @safe is all about.

This has been discussed to death a number of times. A field access obj.field will use addressing with a constant offset. If that offset is larger than the lowest address allowed to the application, unsafety may occur.

The amount of low-address memory protected is OS-dependent. 4KB can virtually always be counted on. For fields placed beyond that limit, a runtime test must be inserted. There are few enough 4KB objects out there to make this practically a non-issue. But the checks must be there.

Hmmm. I forgot about that. So, in essence, dereferencing null pointers is almost always perfectly safe, but in rare corner cases can be unsafe. At that point, we could either always insert runtime checks for pointers to such large types, or we could mark all pointers to such types @system (that's not even vaguely acceptable in the general case, but it might be acceptable in a rare corner case like this). Or we could just disallow such types entirely, though it wouldn't surprise me if someone screamed over that. Runtime checks are probably the best solution, though with any of those solutions, I'd be a bit worried about there being bugs in the implementation, since we then end up with a rare, special case which is not well tested in real environments.
   Which is stupid for something that can be verified at compile time.

In the general case, you can only catch it at compile time if you disallow it completely, which is unnecessarily restrictive. Sure, some basic cases can be caught, but unless the code where the pointer/reference is defined is right next to the code where it's dereferenced, there's no way for the compiler to have any clue whether it's null or not. And yes, there's certainly code where it would make sense to use non-nullable references or pointers, because there's no need for them to be nullable, and having them be non-nullable avoids any risk of forgetting to initialize them, but that doesn't mean that nullable pointers and references aren't useful or that you can catch all instances of a null pointer or reference being dereferenced at compile time.

http://stackoverflow.com/questions/14076296/nullable-annotation-usage

Sure, and there are other things that the compiler can do to catch null dereferences (e.g. look at the first dereferencing of the pointer in the function that it's declared in and make sure that it was initialized or assigned a non-null value first), but the only way to catch all null dereferences at compile time would be to always know at compile time whether the pointer was null at the point that it's dereferenced, and that can't be done. AFAIK, the only solution that guarantees that it catches all dereferences of null at compile time is a solution that disallows a pointer/reference from ever being null in the first place. - Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 03:27:21 UTC, Andrei Alexandrescu 
wrote:
 On 2/1/14, 5:17 PM, deadalnix wrote:
 On Sunday, 2 February 2014 at 01:01:25 UTC, Andrei 
 Alexandrescu wrote:
 On 2/1/14, 1:40 PM, deadalnix wrote:
 On Saturday, 1 February 2014 at 20:09:13 UTC, Andrei 
 Alexandrescu wrote:
 This has been discussed to death a number of times. A field 
 access
 obj.field will use addressing with a constant offset. If 
 that offset
 is larger than the lowest address allowed to the 
 application, unsafety
 may occur.

That is one point. The other point is that the optimizer can remove a null check, and then a load, causing undefined behavior.

I don't understand this. Program crash is defined behavior. Andrei

This has also been discussed. Let's consider the buggy code below:

    void foo(int* ptr) {
        *ptr;
        if (ptr !is null) {
            // do stuff
        }

        // do other stuff
    }

Note that the code presented above looks quite stupid, but this is typically what you end up with if you call 2 functions, one that does a null check and one that doesn't, after inlining.

You would expect the program to segfault at the first line. But it is in fact undefined behavior. The optimizer can decide to remove the null check, as ptr is dereferenced before it and so can't be null, and a later pass can remove the first dereference, as it is a dead load. Both the GCC and LLVM optimizers can exhibit such behavior.

Do you have any pointers to substantiate that? I find such a behavior rather bizarre. Andrei

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html
Feb 01 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 03:35:25 UTC, deadalnix wrote:
 Do you have any pointers to substantiate that? I find such a 
 behavior rather bizarre.

 Andrei

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Also it has to be noted that this very phenomena caused a security flaw in the linux kernel recently, but can't find the link. Anyway, that isn't just a theoretical possibility. However rare, it happens in practice.
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 10:58:12 Andrei Alexandrescu wrote:
 On 1/31/14, 5:39 PM, Jonathan M Davis wrote:
 Regardless, we're not adding anything with regards to non-nullable
 references to the language itself [...]

I think the sea is changing here. I've collected hard data that reveals null pointer dereference is a top problem for at least certain important categories of applications. Upon discussing that data with Walter, it became clear that no amount of belief to the contrary, personal anecdote, or rhetoric can stand against the data.

I'm not sure how I feel about that, particularly since I haven't seen such data myself. My natural reaction when people complain about null pointer problems is that they're sloppy programmers (which isn't necessarily fair, but that's my natural reaction). I pretty much never have problems with null pointers and have never understood why so many people complain about them. Maybe I'm just more organized in how I deal with null than many folks are. I'm not against having non-nullable pointers/references so long as we still have the nullable ones, and a lot of people (or at least a number of very vocal people) want them, but I'm not particularly enthusiastic about the idea either.
 It also became clear that a library solution would improve things but
 cannot compete with a language solution. The latter can do local
 flow-sensitive inference and require notations only for function
 signatures. Consider:
 
 class Widget { ... }
 
 void fun()
 {
      // assume fetchWidget() may return null
      Widget w = fetchWidget();
      if (w)
      {
          ... here w is automatically inferred as non-null ...
      }
 }

Yeah, I think that it's always been clear that a library solution would be inferior to language one. It's just that we could get a large portion of the benefits with a library solution without having to make any language changes. It essentially comes down to a question of whether the additional benefits of having non-nullable references in the language are worth the additional costs that that incurs.
 Bottom line: a language change for non-null references is on the table.
 It will be an important focus of 2014.

Well, I guess that I'll just have to wait and see what gets proposed. - Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sunday, 2 February 2014 at 00:50:28 UTC, Timon Gehr wrote:
 if(auto w = wn.checkNull){
     // ...
 }else w.foo();

(presumably you meant wn.foo, as w would be out of scope in the else branch)

This is more a question of the default than the library though, as all other things being equal, changing the language so that if(w) { /* w changes type */ } still lets w.foo compile.

I have a radical idea about Foo, not sure if people would like it though... I'd like the naked type to mean "borrowed, not null". Then, Nullable!Foo is semi-magical in implementation, but semantically is basically struct Nullable(T) { bool isNull; T object; }. It would *not* be a specialization of Foo; it does not implicitly convert and is not usable out of the box. You'd be forced to convert by checking with the if(w) kind of thing.

The magic is that Nullable!Foo is implemented as a plain old pointer whenever possible instead of a pointer+bool pair. Probably nothing special there, all stuff you've heard of before, and it is ground that C# IMO has covered fairly well.

The other thing I want though is to make it borrowed by default. A borrowed non-immutable object cannot be stored except in the scope of the owner. To store it, you have to assert ownership somehow: is it owned by the GC? Reference counted? Scoped, RAII style? Or DIY C style? You have to mark it one way or another. (This also applies to slices btw)

I say non-immutable since immutability means it never changes, which means it is never freed, which means ownership is irrelevant. In practice, this means all immutable objects are either references to static data or managed by the GC (which provides the illusion of an infinite lifetime).

This would be a radical change by default... but then again, so would not-null by default, so hey, do it together I say.
Feb 01 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Saturday, 1 February 2014 at 22:05:21 UTC, Adam D. Ruppe wrote:
 On Saturday, 1 February 2014 at 18:58:11 UTC, Andrei 
 Alexandrescu wrote:
    Widget w = fetchWidget();
    if (w)
    {
        ... here w is automatically inferred as non-null ...
    }

A library solution to this exists already: Widget wn = fetchWidget(); if(auto w = wn.checkNull) { // w implicitly converts to NotNull!Widget } I've had some difficulty in const correctness with my implementation... but const correct is an entirely separate issue anyway. It isn't quite the same as if(w) but meh, does that matter? The point of the static check is to make you think about it, and that's achieved here. If we do want to get the if(w) to work, I'd really prefer to do that as a library solution too, since then we might be able to use it elsewhere as well. Maybe some kind of template that lets you do a scoped transformation of the type. idk really.

This is a common staple of languages with more advanced type systems that support flow-sensitive typing. Ceylon is an upcoming language that I'm quite excited about that features this. "Typesafe null and flow-sensitive typing" section http://www.ceylon-lang.org/documentation/1.0/introduction/
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 19:40:26 Andrei Alexandrescu wrote:
 On 2/1/14, 7:29 PM, Jonathan M Davis wrote:
 Sure, and there are other things that the compiler can do to catch null
 dereferences (e.g. look at the first dereferencing of the pointer in the
 function that it's declared in and make sure that it was initialized or
 assigned a non-null value first), but the only way to catch all null
 dereferences at compile time would be to always know at compile time
 whether the pointer was null at the point that it's dereferenced, and
 that can't be done.

What are you talking about? That has been done.

How so? All that's required is something like

    auto ptr = foo();
    if(bar()) // runtime-dependent result
        ptr = null;
    ptr.baz();

and there's no way that the compiler could know whether this code is going to dereference null or not. Sure, it could flag it as possible and thus buggy, but it can't know for sure. And an even simpler case is something like

    auto ptr = foo();
    ptr.baz();

The compiler isn't going to know whether foo returns null or not, and whether it returns null could depend on the state at runtime, making it so that it can't possibly know whether foo will return null or not. And even if foo were as simple as

    Bar* foo() { return null; }

the compiler would have to actually look at the body of foo to know that its return value was going to be null, and compilers that use the C compilation model don't normally do that, because they often don't have the body of foo available.

So, maybe I'm missing something here, and maybe we're just not quite talking about the same thing, but it's my understanding that it is not possible for the compiler to always know whether a pointer or reference is going to be null when it's dereferenced unless it's actually illegal for that pointer or reference type to be null (be it because its type is non-nullable or an annotation marks it as such - which effectively then changes its type to non-nullable). As soon as you're dealing with an actual, nullable pointer or reference, it's trivial to make it so that the compiler doesn't have any idea what the pointer's value could be and thus can't know whether it's null or not.

Sections of code which include the initialization or assignment of pointers which are given either new or null certainly can be examined by the compiler so that it can determine whether any dereferencing of that pointer could be dereferencing null, but something as simple as making it so that the pointer was initialized from the return value of a function makes it so that it can't.
 AFAIK, the only solution that guarantees that it catches all dereferences
 of null at compile time is a solution that disallows a pointer/reference
 from ever being null in the first place.

Have you read through that link?

Yes. And all I see is that @Nullable makes it so that some frameworks won't accept null without @Nullable, which effectively turns a normal Java reference into a non-nullable reference. I don't see anything indicating that the compiler will have any clue whether a @Nullable reference is null or not at compile time.

My point is that when a nullable pointer or reference is dereferenced, it's impossible to _always_ be able to determine at compile time whether it's going to dereference null. Some of the time you can, and using non-nullable references certainly makes it so that you can, because then it's illegal for them to be null. But once a reference is nullable, unless the compiler knows the entire code path that that pointer's value goes through, and that code path is guaranteed to result in a null pointer or guaranteed _not_ to, the compiler can't know whether the pointer is going to be null at the point that it's dereferenced. And you can't even necessarily do that with full program analysis, due to the values potentially depending on runtime state. You could determine whether it _could_ be null if you had full program analysis, but you can't determine for certain that it will or won't be - not in all circumstances. And without full program analysis, you can't even do that in most cases, not with nullable references.

But maybe I'm just totally missing something here. I suspect that we're just not communicating clearly enough to quite get what the other is saying.

- Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 01, 2014 19:44:44 Andrei Alexandrescu wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen.

I think that article clearly illustrates that some of Walter's decisions in D with regards to fully defining some stuff that C didn't define were indeed correct. Undefined behavior is your enemy, and clearly, it gets even worse when the optimizer gets involved. *shudder*

- Jonathan M Davis
Feb 01 2014
prev sibling next sibling parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Sunday, 2 February 2014 at 07:54:26 UTC, Jonathan M Davis 
wrote:
 On Saturday, February 01, 2014 19:44:44 Andrei Alexandrescu 
 wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen.

I think that article clearly illustrates that some of Walter's decisions in D with regards to fully defining some stuff that C didn't define were indeed correct. Undefined behavior is your enemy, and clearly, it gets even worse when the optimizer gets involved. *shudder*

Even without undefined behaviour, i.e. a guarantee that null-dereference leads to a segfault, the optimizer can deduce the pointer to be non-null after the dereference. Otherwise the code there could never be reached, because the program would have aborted. This in turn can cause the dereference to be optimized away, if its result is never used any more (dead store):

    auto x = *p;
    if(!p) {
        do_something(x);
    }

In the first step, the if-block will be removed, because its condition is "known" to be false. After that, the value stored into x is unused, and the dereference can get removed too.
Feb 02 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Sunday, 2 February 2014 at 03:45:06 UTC, Andrei Alexandrescu 
wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen. Andrei

As far as I have understood previous posts, it is even worse than that - the LLVM optimiser assumes C semantics whatever the high-level language is. deadalnix, is that true?
Feb 02 2014
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 02, 2014 12:52:44 Timon Gehr wrote:
 On 02/02/2014 04:39 AM, Jonathan M Davis wrote:
 I'm not sure how I feel about that, particularly since I haven't seen such
 data myself. My natural reaction when people complain about null pointer
 problems is that they're sloppy programmers (which isn't necessarily fair,
 but that's my natural reaction).

There is no such thing as 'naturality' that is magically able to justify personal attacks in a technical discussion, even if qualified.

Would you prefer that I had said "initial reaction" or "gut reaction?" I'm just saying that that's how I tend to feel when I see complaints about null pointers. I have never accused anyone of anything or otherwise attacked them because they complained about null pointers. That _would_ be rude.
 The situation does not have to be unbearable in order to
 improve it!

True, but I don't even agree that null pointers are that big a deal in the first place. If we really want to add non-nullable pointers or references to the language, then we can. I don't think that that's necessarily a bad idea. But I doubt that I'll use them often, and I do think that the whole issue is frequently blown out of proportion.

- Jonathan M Davis
Feb 02 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/3/14, 9:09 PM, Jonathan M Davis wrote:
 I truly hope that that's never the case. Adding non-nullable references to the
 language is one thing; making them the default is quite another, and making
 them the default would break existing code. And given Walter's normal stance
 on code breakage, I'd be very surprised if he were in favor of making non-
 nullable references or pointers the default.

We are considering making non-nullables the default, @nullable to mark optionally null objects, and enabling the related checks with an opt-in compiler flag.

Andrei
Feb 03 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/3/14, 10:22 PM, deadalnix wrote:
 On Tuesday, 4 February 2014 at 06:03:14 UTC, Andrei Alexandrescu wrote:
 On 2/3/14, 9:09 PM, Jonathan M Davis wrote:
 I truly hope that that's never the case. Adding non-nullable
 references to the
 language is one thing; making them the default is quite another, and
 making
 them the default would break existing code. And given Walter's normal
 stance
 on code breakage, I'd be very surprised if he were in favor of making
 non-
 nullable references or pointers the default.

We are considering making non-nullables the default, nullable to mark optionally null objects, and enable the related checks with an opt-in compiler flag. Andrei

That would be awesome. The breakage involved is quite high, however.

No breakage if the opt-in flag is not used.

Andrei
Feb 03 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/4/14, 12:32 AM, deadalnix wrote:
 On Tuesday, 4 February 2014 at 06:49:57 UTC, Andrei Alexandrescu wrote:
 That would be awesome. The breakage involved, is quite high however.

No breakage if the opt-in flag is not used. Andrei

OK, if you are willing to do that change, I'm 200% behind it! Question: why do you propose to use @nullable instead of Nullable!T?

Because it will be inferred locally, and we have precedent for that with attributes but not with type constructors.

Andrei
Feb 04 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/4/14, 6:54 AM, Adam D. Ruppe wrote:
 On Tuesday, 4 February 2014 at 14:34:49 UTC, Idan Arye wrote:
 Probably because `Nullable!` suggests that's it's a library solution -
 and it isn't.

It should be. The way I'd do it is

    Object o; // not null
    @nullable Object o; // like we have today

BUT, user code would never use that. Instead, we'd have:

    struct Nullable(T) if(__traits(compiles, (@nullable T) {})) {
        @nullable T;
    }

Yah, that's what I have in mind.

Andrei
Feb 04 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Sunday, 2 February 2014 at 12:42:47 UTC, Jonathan M Davis 
wrote:
 But I doubt that I'll use them often, and I do think that the 
 whole issue is
 frequently blown out of proportion.

 - Jonathan M Davis

I agree that there is not much benefit in opt-in null-free pointers. But if making those opt-out were possible, I'd love it. That breaks every single D program out there, though.
Feb 02 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sunday, 2 February 2014 at 13:18:19 UTC, Nick Treleaven wrote:
 I read your recent post about this, it was interesting. But I 
 don't think you can disallow this:

     auto cn = checkNull(cast(C)null);
     NotNull!C nn = cn;

 obj2 is then null, when it shouldn't be allowed.

It wouldn't be null - it would be a runtime assertion failure (except in release mode, when it would indeed be null). I think that's OK. Two reasons this is an improvement anyway:

1) The error message of "cannot implicitly convert cn of type C to nn of type NotNull!C" made you realize there's a potential problem here and attempt a fix. That's why checkNull is there in the first place - it made you consider the problem. Your fix isn't really right, but at least now it should be obvious why.

2) The assertion failure happens right there at the assignment point (the assert is in NotNull's constructor) instead of at the use point. So when it fails at runtime, you don't have to work backwards to figure out where null was introduced; it points you straight at it, and it won't be too hard to see that cn wasn't properly checked.

Maybe not perfect, but I really think it is good enough and an improvement.... if people actually use NotNull!T in their functions and structures in the first place.
Feb 02 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sunday, 2 February 2014 at 13:15:42 UTC, Dicebot wrote:
 I agree that there is no much benefit in opt-in null-free 
 pointers. But if making those opt-out would have been possible, 
 I'd love it. Breaks every single D program out there though.

This isn't necessarily so bad. My biggest chunk of code that uses classes is probably my dom.d.... and there's only... I think six functions in there that can return null, and only one property that can be null. (In fact, I think like 1/5 of the lines in there are contracts and invariants relating to null anyway. Once I was getting segfaults because of a corrupted tree and that's ultimately how I tracked it down.)

So if "Element" were changed to be not null by default, the majority of the program should still compile! Then it is a simple case of looking at the compiler errors complaining about assigning null and throw in the Nullable! thingy which shouldn't take that long.

Code like

    auto a = new A();
    a.foo();

needn't break. Really, I think it would be likely to find more bugs, or at least save time writing dozens of contracts - it would be the "worth it" kind of breakage.
Feb 02 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Sunday, 2 February 2014 at 13:55:11 UTC, Adam D. Ruppe wrote:
 On Sunday, 2 February 2014 at 13:15:42 UTC, Dicebot wrote:
 I agree that there is no much benefit in opt-in null-free 
 pointers. But if making those opt-out would have been 
 possible, I'd love it. Breaks every single D program out there 
 though.

 This isn't necessarily so bad. My biggest chunk of code that 
 uses classes is probably my dom.d.... and there's only... I 
 think six functions in there that can return null, and only one 
 property that can be null. (In fact, I think like 1/5 of the 
 lines in there are contracts and invariants relating to null 
 anyway. Once I was getting segfaults because of a corrupted 
 tree and that's ultimately how I tracked it down.)

 So if "Element" were changed to be not null by default, the 
 majority of the program should still compile! Then it is a 
 simple case of looking at the compiler errors complaining about 
 assigning null and throw in the Nullable! thingy which 
 shouldn't take that long.

 Code like

     auto a = new A();
     a.foo();

 needn't break. Really, I think it would be likely to find more 
 bugs, or at least save time writing dozens of contracts - it 
 would be the "worth it" kind of breakage.

I think it's safe to assume that you - being a supporter of the non-null movement - write your own code in a way that tries to avoid the usage of null as much as possible. Other people - like me - treat null as a valid value.

If I have a class\struct `Foo` with a member field `bar` of type `Bar`, and an instance of `Foo` named `foo` that happens to have no `Bar`, I'll not add an extra boolean field just to indicate that `foo` has no `Bar` - I'll simply set `foo.bar` to null! And I'll use the fact that null in D is false, so `if(auto foobar=foo.bar)` will only enter the block if `foo.bar` is not null, and `foobar` will only be declared in that block, where it's guaranteed to not be null. And I'll use the fact that UFCS works perfectly fine when the first argument is null to build functions that accept `Bar` as first argument and do the null checking internally (if it's needed!) and safely call them on `foo.bar`.

So yea, your code won't break. That doesn't mean other code won't break.
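The two patterns described - the scoped `if(auto ...)` declaration and a UFCS helper that tolerates a null first argument - can be sketched like this (the type and helper names are made up for illustration):

```d
class Bar { int x; }
class Foo { Bar bar; } // null here is a valid state: "this Foo has no Bar"

// UFCS helper that does the null check internally, so calling it
// directly on a possibly-null foo.bar is safe.
int xOrZero(Bar b) { return b is null ? 0 : b.x; }

void main()
{
    auto foo = new Foo(); // foo.bar starts out null

    // null evaluates as false, and foobar exists only inside the
    // block, where it is guaranteed non-null.
    if (auto foobar = foo.bar)
    {
        foobar.x = 42;
    }

    assert(foo.bar.xOrZero() == 0); // fine even though foo.bar is null
}
```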
Feb 02 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Sunday, 2 February 2014 at 12:42:47 UTC, Jonathan M Davis 
wrote:
 On Sunday, February 02, 2014 12:52:44 Timon Gehr wrote:
 On 02/02/2014 04:39 AM, Jonathan M Davis wrote:
 I'm not sure how I feel about that, particularly since I 
 haven't seen such
 data myself. My natural reaction when people complain about 
 null pointer
 problems is that they're sloppy programmers (which isn't 
 necessarily fair,
 but that's my natural reaction).

There is no such thing as 'naturality' that is magically able to justify personal attacks in a technical discussion, even if qualified.

Would you prefer that I had said "initial reaction" or "gut reaction?" I'm just saying that that's how I tend to feel when I see complaints about null pointers. I have never accused anyone of anything or otherwise attacked them because they complained about null pointers. That _would_ be rude.
 The situation does not have to be unbearable in order to
 improve it!

True, but I don't even agree that null pointers are that big a deal in the first place. If we really want to add non-nullable pointers or references to the language, then we can. I don't think that that's necessarily a bad idea. But I doubt that I'll use them often, and I do think that the whole issue is frequently blown out of proportion. - Jonathan M Davis

Ideally you'd be using them wherever you use objects and pointers, as they'd be the default.
Feb 02 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Sunday, 2 February 2014 at 15:06:34 UTC, Idan Arye wrote:
 I think it's safe to assume that you - being a supporter of the 
 non-null movement - write your own code in a way that tries to 
 avoid the usage of null as much as possible.

You'd be wrong - I was against the not null thing for a long time, including while writing dom.d.
 If I have a class\struct `Foo` with a member field `bar` of 
 type `Bar`, and an instance of `Foo` named `foo` that happens 
 to have no `Bar`, I'll not add an extra boolean field just to 
 indicate that `foo` has no `Bar` - I'll simply set `foo.bar` to 
 null!

Me too, that's exactly what I did with Element parentNode for instance.
 And I'll use the fact that UFCS works perfectly fine when the 
 first argument is null to build functions that accept `Bar` as 
 first argument and do the null checking internally(if it's 
 needed!) and safely call them on `foo.bar`.

Again, me too. Some of my code would break with not null by default, but the amazing thing is it really isn't the majority of it, and since the compiler error would point to just where it is, adding the Nullable! to the type is fairly easy.
Feb 02 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Sunday, 2 February 2014 at 18:33:05 UTC, Adam D. Ruppe wrote:
 On Sunday, 2 February 2014 at 15:06:34 UTC, Idan Arye wrote:
 I think it's safe to assume that you - being a supporter of 
 the non-null movement - write your own code in a way that 
 tries to avoid the usage of null as much as possible.

You'd be wrong - I was against the not null thing for a long time, including while writing dom.d.
 If I have a class\struct `Foo` with a member field `bar` of 
 type `Bar`, and an instance of `Foo` named `foo` that happens 
 to have no `Bar`, I'll not add an extra boolean field just to 
 indicate that `foo` has no `Bar` - I'll simply set `foo.bar` 
 to null!

Me too, that's exactly what I did with Element parentNode for instance.
 And I'll use the fact that UFCS works perfectly fine when the 
 first argument is null to build functions that accept `Bar` as 
 first argument and do the null checking internally(if it's 
 needed!) and safely call them on `foo.bar`.

Again. me too. Some of my code would break with not null by default, but the amazing thing is it really isn't the majority of it, and since the compiler error would point just where it is, adding the Nullable! to the type is fairly easy.

OK, I see now. What you say is that even if some code breaks, it'll be easy to refactor because the compiler will easily pinpoint the locations where `Nullable!` should be added.

Well, I don't think it'll be that straightforward. In order for non-nullable-by-default to mean something, most APIs will need to use it and not automatically reach for `Nullable!`. While non-nullable can be implicitly cast to nullable, the reverse is not true, and whenever code fails to compile because it passes a nullable-typed value as a non-nullable argument, it can't be fixed automatically - you'll need to check for nulls, and to actually decide what to do when the value is null. Now, this is doable in your own code, but what if you use a third party library? Whose source you are not familiar with? Automatically downloaded from a repository that you don't have commit-rights to?
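For illustration, a hedged sketch of the friction being described - the `Nullable!` type and the `render` API below are hypothetical, but they show why the nullable-to-non-nullable direction needs a human decision:

```d
class Widget { void draw() {} }

// Hypothetical stand-in for a language-level nullable type.
struct Nullable(T)
{
    private T payload; // null encodes "empty" for class types
    bool isNull() const { return payload is null; }
    T get() { assert(!isNull); return payload; }
    void opAssign(T v) { payload = v; }
}

// Imagine this is a third-party API written against non-null-by-default:
void render(Widget w) { w.draw(); }

void main()
{
    Nullable!Widget maybe; // e.g. the result of some lookup
    // render(maybe); // would not compile: nullable -> non-nullable
    // The fix cannot be mechanical - someone has to decide what
    // "no widget" means at this call site:
    if (!maybe.isNull)
        render(maybe.get());
}
```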
Feb 02 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 07:54:26 UTC, Jonathan M Davis 
wrote:
 On Saturday, February 01, 2014 19:44:44 Andrei Alexandrescu 
 wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen.

I think that article clearly illustrates that some of Walter's decisions in D with regards to fully defining some stuff that C didn't define were indeed correct. Undefined behavior is your enemy, and clearly, it gets even worse when the optimizer gets involved. *shudder* - Jonathan M Davis

What you don't seem to understand is the associated cost. Defining integer overflow to wrap around is easy and does not cost much. But in our case, it implies that the optimizer won't be able to optimize away loads that it can't prove won't trap. That means the compiler won't be able to optimize away most loads.
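A small sketch of the kind of load in question (the function is hypothetical, for illustration only):

```d
int readAndDiscard(int* p)
{
    auto x = *p; // the result is never used; a C-style optimizer may
                 // delete this load outright, since a trapping load
                 // would be undefined behavior anyway
    return 0;    // but if null dereference had *defined* semantics
                 // (it must trap), the dead load could no longer be
                 // removed unless p is provably non-null
}
```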
Feb 02 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 10:58:51 UTC, Dicebot wrote:
 On Sunday, 2 February 2014 at 03:45:06 UTC, Andrei Alexandrescu 
 wrote:
 On 2/1/14, 7:35 PM, deadalnix wrote:
 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

Whoa, thanks. So the compiler figures null pointer dereference in C is undefined behavior, which means the entire program could do whatever if that does happen. Andrei

As far as I have understood previous posts, it is even worse than that - the LLVM optimiser assumes C semantics no matter what the high-level language is. deadalnix, is that true?

It depends. For instance, you can specify the semantics of wrap around, so both undefined and defined overflow exist. In the precise case we are talking about, it really would not make any sense to propose any other semantics, as it would prevent the optimizer from optimizing away most loads.
Feb 02 2014
prev sibling next sibling parent "Tove" <Tove fransson.se> writes:
On Sunday, 2 February 2014 at 09:56:06 UTC, Marc Schütz wrote:
 auto x = *p;
 if(!p) {
     do_something(x);
 }

 In the first step, the if-block will be removed, because its 
 condition is "known" to be false. After that, the value stored 
 into x is unused, and the dereference can get removed too.

With a good static analyzer, such as Coverity, this program would be rejected anyway with "check_after_deref". If the compiler is smart enough to do the optimization, it could be smart enough to issue a warning as well!
Feb 02 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 2 February 2014 at 23:55:48 UTC, Andrei Alexandrescu 
wrote:
 On 2/2/14, 3:44 PM, Timon Gehr wrote:
 On 02/03/2014 12:09 AM, Andrei Alexandrescu wrote:
 A front-end pass could replace the dead dereference with a 
 guard that
 asserts the reference is not null.

I don't think this would be feasible. (The front-end pass would need to simulate all back-end passes in order to find all the references that might be proven dead.)

Well I was thinking of a backend-specific assertion directive. Worst case, the front end could assign to a volatile global: __vglobal = *p; Andrei

As far as the backend is concerned, dereferencing and assigning to a volatile are 2 distinct operations.
Feb 02 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 3 February 2014 at 03:49:20 UTC, Andrei Alexandrescu 
wrote:
 No matter. The point is the dereference will not be dead to the 
 optimizer.

 Andrei

I see. But how would you add these without adding thousands of them all over the place?
Feb 02 2014
prev sibling next sibling parent "Uranuz" <neuranuz gmail.com> writes:
I'm not sure what to do about null references. One of my ideas is 
that some type of Exception or Throwable object should be thrown 
when accessing a null pointer instead of getting a SEGFAULT. In 
this way the programmer can handle the exception at application 
level, get some information about the problem, and maybe 
sometimes it's possible to recover. At the current state the OS 
sends a SEGFAULT message and I can't even get some info about 
where the source of the problem is. It's the most annoying thing 
about null references for me.

Of course we can provide some wrapper struct for a reference that 
will check the reference for null and throw an Exception if it's 
null. Another way is to provide it at the compiler level. The 
third variant is to forbid null references somehow at all 
(currently it's possible via a wrapper). As I'm thinking today, 
for me the 2nd variant is preferred. Providing an Error when 
reading a null variable will reduce code bloat.

But it's only my point of view. Since I don't know anything about 
compiler implementation, it's interesting to know what you think 
of this.
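The wrapper variant mentioned above could be sketched roughly like this (the type names are invented for illustration):

```d
import std.stdio : writeln;

class NullAccess : Exception
{
    this(string msg) { super(msg); }
}

// Illustrative checked reference: accesses go through get(), which
// throws a catchable exception instead of letting the process die
// with a bare SEGFAULT.
struct Checked(T) if (is(T == class))
{
    T payload;

    T get()
    {
        if (payload is null)
            throw new NullAccess("null dereference caught");
        return payload;
    }

    alias get this; // member access is routed through the check
}

class Widget { void draw() {} }

void main()
{
    Checked!Widget w; // payload defaults to null
    try
        w.draw(); // routed through get(), which throws
    catch (NullAccess e)
        writeln(e.msg); // recoverable at application level
}
```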
Feb 03 2014
prev sibling next sibling parent "Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Saturday, 1 February 2014 at 18:58:11 UTC, Andrei Alexandrescu 
wrote:
 It also became clear that a library solution would improve 
 things but cannot compete with a language solution.

if even Andrei is asking for a language solution instead of a library one! :o)
 Bottom line: a language change for non-null references is on 
 the table. It will be an important focus of 2014.

sometimes.
Feb 03 2014
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 03, 2014 00:56:05 Walter Bright wrote:
 On 2/3/2014 12:00 AM, Uranuz wrote:
 At the current state OS send SEGFAULT message and I can't
 even get some info where is the source of problem.

1. Compile with symbolic debug info on (-g switch) 2. Run under a debugger, such as gdb 3. When it seg faults, type: bt and it will give you a "backtrace", i.e. where it faulted, and the functions that are on the stack.

I recall there being a change to druntime such that it would print a backtrace for you if you hit a segfault, but it doesn't seem to work when I write a quick test program which segfaults, so I'm not sure what happened to that. Certainly, I think that that solves the issue from the standpoint of the program simply printing out "segmentation fault" being too little info (since it would then print a stacktrace as well). But as long as that doesn't work, using gdb's the best way - and it has the advantage of letting you examine the program state at that point in time rather than just see the stacktrace. - Jonathan M Davis
Feb 03 2014
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, February 03, 2014 09:30:32 Ary Borenszweig wrote:
 On 2/1/14, 7:14 AM, Jonathan M Davis wrote:
 In the general case, you can only catch it at compile time if you disallow
 it completely, which is unnecessarily restrictive. Sure, some basic cases
 can be caught, but unless the code where the pointer/reference is defined
 is right next to the code where it's dereferenced, there's no way for the
 compiler to have any clue whether it's null or not.

This is not true. It's possible to do this, at least for the case where you dereference a variable or an object's field. See this: http://crystal-lang.org/2013/07/13/null-pointer-exception.html

In the _general_ case it is not possible. It's certainly possible within certain snippets of code - an obvious example being

    int* i;
    *i = 5;

but it's not possible for the compiler to know under all sets of circumstances. Something as simple as using a function's return value makes it so that it can't know. e.g.

    int* i = foo();
    *i = 5;

For it to know, it would have to examine the body of foo (which it doesn't necessarily have the code for under C's compilation model - which D uses), and even if it did, that wouldn't be enough. e.g.

    int* foo()
    {
        return "/etc/foo".exists ? new int : null;
    }

The compiler could flag that as _possibly_ returning null and therefore the previous code as _possibly_ dereferencing null, but it can't know for sure. So, with full program analysis, it would certainly be possible to determine whether a particular pointer _might_ be null when it's dereferenced and flag that as a warning or error, but without full program analysis (which the compiler can't normally do with C's compilation model), you can't do that in most cases, and even with full program analysis, you can't always determine whether a pointer will definitively be null or not when it's dereferenced.

Crystal's compilation model may allow it to check a lot more code than C's does, allowing it to detect null dereferences in more cases, and it may very well simply error out when it detects that a pointer _might_ be null when it's dereferenced, but as I understand it, it's fundamentally impossible to determine at compile time under all circumstances whether a pointer will definitively be null or not (as evidenced by the body of foo above - the result depends entirely on runtime state).

- Jonathan M Davis
Feb 03 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Monday, 3 February 2014 at 22:18:35 UTC, Jonathan M Davis 
wrote:
 For it to know, it would have to examine the body of foo (which 
 it doesn't
 necessarily have the code for under C's compilation model - 
 which D uses), and
 even if it did that wouldn't be enough e.g.

 int* foo()
 {
     return "/etc/foo".exists ? new int : null;
 }

 The compiler could flag that as _possibly_ returning null and 
 therefore the
 previous code _possibly_ dereferencing null, but it can't know 
 for sure.

If null is an invalid value to assign to a pointer, then there's no issue.

    int* foo()
    {
        //Error: cannot implicitly convert typeof(null) to type int*
        return "/etc/foo".exists ? new int : null;
    }
Feb 03 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
This is fantastic news, Andrei!

It's all about provably correct code like Walter often brings up 
on his talks.

If/When this arrives, D will become an even more suitable 
replacement for safety critical languages like Ada.
Feb 03 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 3 February 2014 at 22:23:52 UTC, Meta wrote:
 If null is an invalid value to assign to a pointer, then 
 there's no issue.

 int* foo()
 {
     //Error: cannot implicitly convert typeof(null) to type int*
     return "/etc/foo".exists ? new int : null;
 }

Only cross abstraction boundaries is sufficient.
Feb 03 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Monday, 3 February 2014 at 23:34:59 UTC, deadalnix wrote:
 On Monday, 3 February 2014 at 22:23:52 UTC, Meta wrote:
 If null is an invalid value to assign to a pointer, then 
 there's no issue.

 int* foo()
 {
    //Error: cannot implicitly convert typeof(null) to type int*
    return "/etc/foo".exists ? new int : null;
 }

Only cross abstraction boundaries is sufficient.

Can you explain?
Feb 03 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 4 February 2014 at 01:09:52 UTC, Meta wrote:
 On Monday, 3 February 2014 at 23:34:59 UTC, deadalnix wrote:
 On Monday, 3 February 2014 at 22:23:52 UTC, Meta wrote:
 If null is an invalid value to assign to a pointer, then 
 there's no issue.

 int* foo()
 {
   //Error: cannot implicitly convert typeof(null) to type int*
   return "/etc/foo".exists ? new int : null;
 }

Only cross abstraction boundaries is sufficient.

Can you explain?

    void foo()
    {
        Widget w = null; // OK
        w.foo(); // Error: w might be null

        w = new Widget();
        w.foo(); // OK

        if(condition)
        {
            w = null;
        }

        w.foo(); // Error

        if(w !is null)
        {
            w.foo(); // OK
        }

        bar(w); // Error, w might be null
        return w; // Error, w might be null
    }
Feb 03 2014
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, February 03, 2014 22:23:51 Meta wrote:
 On Monday, 3 February 2014 at 22:18:35 UTC, Jonathan M Davis
 
 wrote:
 For it to know, it would have to examine the body of foo (which
 it doesn't
 necessarily have the code for under C's compilation model -
 which D uses), and
 even if it did that wouldn't be enough e.g.
 
 int* foo()
 {
 
 return "/etc/foo".exists ? new int : null;
 
 }
 
 The compiler could flag that as _possibly_ returning null and
 therefore the
 previous code _possibly_ dereferencing null, but it can't know
 for sure.

If null is an invalid value to assign to a pointer, then there's no issue.

Yes, but I wasn't talking about non-nullable pointers. I was talking about how in the general case, it's impossible to determine at compile time whether a nullable pointer is null and that it's therefore impossible (in the general case) to determine at compile time whether dereferencing a nullable pointer will attempt to dereference null. Non-nullable pointers side-steps the issue entirely. - Jonathan M Davis
Feb 03 2014
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Sunday, February 02, 2014 17:41:58 Meta wrote:
 True, but I don't even agree that null pointers are that big a
 deal in the
 first place. If we really want to add non-nullable pointers or
 references to
 the language, then we can. I don't think that that's
 necessarily a bad idea.
 But I doubt that I'll use them often, and I do think that the
 whole issue is
 frequently blown out of proportion.
 
 - Jonathan M Davis

Ideally you'd be using them wherever you use objects and pointers, as they'd be the default.

I truly hope that that's never the case. Adding non-nullable references to the language is one thing; making them the default is quite another, and making them the default would break existing code. And given Walter's normal stance on code breakage, I'd be very surprised if he were in favor of making non-nullable references or pointers the default. - Jonathan M Davis
Feb 03 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 4 February 2014 at 06:03:14 UTC, Andrei Alexandrescu 
wrote:
 On 2/3/14, 9:09 PM, Jonathan M Davis wrote:
 I truly hope that that's never the case. Adding non-nullable 
 references to the
 language is one thing; making them the default is quite 
 another, and making
 them the default would break existing code. And given Walter's 
 normal stance
 on code breakage, I'd be very surprised if he were in favor of 
 making non-
 nullable references or pointers the default.

We are considering making non-nullables the default, @nullable to mark optionally null objects, and enable the related checks with an opt-in compiler flag. Andrei

That would be awesome. The breakage involved is quite high, however.
Feb 03 2014
prev sibling next sibling parent "Joseph Cassman" <jc7919 outlook.com> writes:
On Tuesday, 4 February 2014 at 06:03:14 UTC, Andrei Alexandrescu 
wrote:
 On 2/3/14, 9:09 PM, Jonathan M Davis wrote:
 I truly hope that that's never the case. Adding non-nullable 
 references to the
 language is one thing; making them the default is quite 
 another, and making
 them the default would break existing code. And given Walter's 
 normal stance
 on code breakage, I'd be very surprised if he were in favor of 
 making non-
 nullable references or pointers the default.

We are considering making non-nullables the default, @nullable to mark optionally null objects, and enable the related checks with an opt-in compiler flag. Andrei

I am really interested in learning more about how the data you have collected moved you and Walter to make this large change. Especially since it seems that much of the time arguments made for or against either side of the issue seem to be based on anecdotal evidence, or built up based on the small slice of coding that we each deal with. If you get time for an article that would be pretty cool. Joseph
Feb 03 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 4 February 2014 at 06:49:57 UTC, Andrei Alexandrescu 
wrote:
 That would be awesome. The breakage involved, is quite high 
 however.

No breakage if the opt-in flag is not used. Andrei

OK, if you are willing to do that change, I'm 200% behind! Question: why do you propose to use @nullable instead of Nullable!T?
Feb 04 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Tuesday, 4 February 2014 at 02:27:23 UTC, deadalnix wrote:
 On Tuesday, 4 February 2014 at 01:09:52 UTC, Meta wrote:
 On Monday, 3 February 2014 at 23:34:59 UTC, deadalnix wrote:
 On Monday, 3 February 2014 at 22:23:52 UTC, Meta wrote:
 If null is an invalid value to assign to a pointer, then 
 there's no issue.

 int* foo()
 {
  //Error: cannot implicitly convert typeof(null) to type int*
  return "/etc/foo".exists ? new int : null;
 }

Only cross abstraction boundaries is sufficient.

Can you explain?

    void foo()
    {
        Widget w = null; // OK
        w.foo(); // Error: w might be null

        w = new Widget();
        w.foo(); // OK

        if(condition)
        {
            w = null;
        }

        w.foo(); // Error

        if(w !is null)
        {
            w.foo(); // OK
        }

        bar(w); // Error, w might be null
        return w; // Error, w might be null
    }

Why is `bar(w);` an error? It may be perfectly valid for `bar` to accept null as an argument.
Feb 04 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 4 February 2014 at 13:11:49 UTC, Idan Arye wrote:
 Why is `bar(w);` an error? I may be perfectly valid for `bar` 
 to accept null as argument.

It should declare its argument as Nullable!w then.
Feb 04 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Tuesday, 4 February 2014 at 13:20:12 UTC, Dicebot wrote:
 On Tuesday, 4 February 2014 at 13:11:49 UTC, Idan Arye wrote:
 Why is `bar(w);` an error? I may be perfectly valid for `bar` 
 to accept null as argument.

It should declare its argument as Nullable!w then.

If non-null-by-default is implemented, that code will break on its very first line, `Widget w = null;`. I was not talking about that - I was talking about the static analysis deadalnix demonstrated in the comments of that code.
Feb 04 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Tuesday, 4 February 2014 at 08:32:26 UTC, deadalnix wrote:
 On Tuesday, 4 February 2014 at 06:49:57 UTC, Andrei 
 Alexandrescu wrote:
 That would be awesome. The breakage involved, is quite high 
 however.

No breakage if the opt-in flag is not used. Andrei

OK, If you are willing to do that change, I'm 200% behind ! Question, why do you propose to use nullable instead of Nullable!T ?

Probably because `Nullable!` suggests that it's a library solution - and it isn't. In order to implement nullables without core language support for nullables you need to do some workarounds that waste memory and cycles. Also, implementing it as a core language feature makes life easier for the optimizer. At any rate, currently all built-in attributes that use the @ prefix (all two of them) are declaration attributes - but `@nullable` should be a type attribute, so we can send things like `@nullable(int)` as template parameters.
Feb 04 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 4 February 2014 at 14:34:49 UTC, Idan Arye wrote:
 Probably because `Nullable!` suggests that's it's a library 
 solution - and it isn't.

It should be. The way I'd do it is

    Object o; // not null
    @nullable Object o; // like we have today

BUT, user code would never use that. Instead, we'd have:

    struct Nullable(T) if(__traits(compiles, (@nullable T t) {}))
    {
        @nullable T t;
    }
    // and a corresponding one so stuff like Nullable!int works

This gives us:

* Implementation help - no binary cost for Nullable!Object since it just uses null directly instead of a bool isNull field (the optimizer also knows this)

* Consistency with all other types. Nullable!int works, Nullable!Object can be passed to a template, inspected, etc. without new traits for isNullable and everything.

* Library functionality so we can also make other types that do the same kind of thing

Then, if we did the Type? syntax, it would just be rewritten into Nullable!Type. Nullable's definition would probably be in the auto-imported object.d so it always works.
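A sketch of what the library layer could look like, with a plain flag standing in for the hypothetical @nullable language support (under real @nullable, the class branch could drop even the separate code path, since null itself encodes the empty state):

```d
// Sketch only: the class branch shows the "no binary cost" idea -
// null itself encodes the empty state, so no extra bool is stored.
struct Nullable(T)
{
    static if (is(T == class))
    {
        private T payload;
        bool isNull() const { return payload is null; }
        void opAssign(T v) { payload = v; }
        T get() { assert(!isNull); return payload; }
    }
    else
    {
        private T payload;
        private bool hasValue = false; // value types need the extra flag
        bool isNull() const { return !hasValue; }
        void opAssign(T v) { payload = v; hasValue = true; }
        T get() { assert(!isNull); return payload; }
    }
}

class C { }

void main()
{
    Nullable!int n; // consistency: the same interface works for ints...
    assert(n.isNull);
    n = 3;
    assert(n.get == 3);

    Nullable!C c; // ...and for classes, with no extra field stored
    assert(c.isNull);
    c = new C();
    assert(!c.isNull);
}
```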
Feb 04 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Tuesday, 4 February 2014 at 14:54:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 4 February 2014 at 14:34:49 UTC, Idan Arye wrote:
 Probably because `Nullable!` suggests that's it's a library 
 solution - and it isn't.

 It should be. The way I'd do it is

     Object o; // not null
     @nullable Object o; // like we have today

 BUT, user code would never use that. Instead, we'd have:

     struct Nullable(T) if(__traits(compiles, (@nullable T t) {}))
     {
         @nullable T t;
     }
     // and a corresponding one so stuff like Nullable!int works

 This gives us:

 * Implementation help - no binary cost for Nullable!Object 
 since it just uses null directly instead of a bool isNull field 
 (the optimizer also knows this)

 * Consistency with all other types. Nullable!int works, 
 Nullable!Object can be passed to a template, inspected, etc. 
 without new traits for isNullable and everything.

 * Library functionality so we can also make other types that do 
 the same kind of thing

 Then, if we did the Type? syntax, it would just be rewritten 
 into Nullable!Type. Nullable's definition would probably be in 
 the auto-imported object.d so it always works.

Sounds awesome.
Feb 04 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Tuesday, 4 February 2014 at 14:54:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 4 February 2014 at 14:34:49 UTC, Idan Arye wrote:
 Probably because `Nullable!` suggests that's it's a library 
 solution - and it isn't.

 It should be. The way I'd do it is

     Object o; // not null
     @nullable Object o; // like we have today

 BUT, user code would never use that. Instead, we'd have:

     struct Nullable(T) if(__traits(compiles, (@nullable T t) {}))
     {
         @nullable T t;
     }
     // and a corresponding one so stuff like Nullable!int works

 This gives us:

 * Implementation help - no binary cost for Nullable!Object 
 since it just uses null directly instead of a bool isNull field 
 (the optimizer also knows this)

 * Consistency with all other types. Nullable!int works, 
 Nullable!Object can be passed to a template, inspected, etc. 
 without new traits for isNullable and everything.

 * Library functionality so we can also make other types that do 
 the same kind of thing

 Then, if we did the Type? syntax, it would just be rewritten 
 into Nullable!Type. Nullable's definition would probably be in 
 the auto-imported object.d so it always works.

I'm interested in how this might fit in with your recent discovery in this thread: http://forum.dlang.org/thread/majnjuhxdefjuqjlpbmv forum.dlang.org?page=1
Feb 04 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 4 February 2014 at 13:44:00 UTC, Idan Arye wrote:
 On Tuesday, 4 February 2014 at 13:20:12 UTC, Dicebot wrote:
 On Tuesday, 4 February 2014 at 13:11:49 UTC, Idan Arye wrote:
 Why is `bar(w);` an error? I may be perfectly valid for `bar` 
 to accept null as argument.

It should declare its argument as Nullable!w then.

If non-null-by-default will be implemented that code will crash on the first line `Widget w = null;`. I was not talking about that - I was talking about the static analysis deadalnix demonstrated in the comments of that code.

Dicebot is right.
Feb 04 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 4 February 2014 at 14:54:35 UTC, Adam D. Ruppe wrote:
 This gives us:

 * Implementation help - no binary cost for Nullable!Object 
 since it just uses null directly instead of a bool isNull field 
 (the optimizer also knows this)

static if(isReferenceType!T) {
    union {
        T t;
        typeof(null) __;
    }
}
 * Consistency with all other types. Nullable!int works, 
 Nullable!Object can be passed to a template, inspected, etc. 
 without new traits for isNullable and everything.

I'm not sure I understand that.
 * Library functionality so we can also make other types that do 
 the same kind of thing

I'm really confused now. What are you defending ??
 Then, if we did the Type? syntax, it would just be rewritten 
 into Nullable!Type. Nullable's definition would probably be in 
 the auto-imported object.d so it always works.

??????
Feb 04 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Tuesday, 4 February 2014 at 14:54:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 4 February 2014 at 14:34:49 UTC, Idan Arye wrote:
 Probably because `Nullable!` suggests that it's a library 
 solution - and it isn't.

 It should be. The way I'd do it is
 
 Object o; // not null
 @nullable Object o; // like we have today
 
 BUT, user code would never use that. Instead, we'd have:
 
 struct Nullable(T) if(__traits(compiles, (@nullable T t) {})) {
     @nullable T t;
 }
 
 // and a corresponding one so stuff like Nullable!int works
 
 This gives us:
 
 * Implementation help - no binary cost for Nullable!Object since it 
 just uses null directly instead of a bool isNull field (the optimizer 
 also knows this)
 
 * Consistency with all other types. Nullable!int works, Nullable!Object 
 can be passed to a template, inspected, etc. without new traits for 
 isNullable and everything.
 
 * Library functionality so we can also make other types that do the 
 same kind of thing
 
 Then, if we did the Type? syntax, it would just be rewritten into 
 Nullable!Type. Nullable's definition would probably be in the 
 auto-imported object.d so it always works.

So what you are saying is that it should be implemented in the core 
language (`@nullable`), and then wrapped in the standard library 
(`Nullable!`) so we can have the benefit of using D's and Phobos' rich 
template-handling functionality?

Sounds good, but the one problem is that the `@nullable` syntax looks 
too clean - cleaner than `Nullable!`. That means some people will 
prefer it - enough to negate the benefits of using the template. I 
think the core language implementation should look more ugly and 
cumbersome. How about `__traits(nullable, T)`? People already know 
that traits are better used via wrappers whenever possible - in 
contrast to attributes, which are meant to be used directly.
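The wrap-the-trait convention can be seen with traits that exist today; a minimal sketch using __traits(isAbstractClass, ...) as a stand-in for the hypothetical nullable trait:

```d
// Phobos-style pattern: hide the ugly __traits call behind a clean
// eponymous template that reads like any other library predicate.
enum isAbstractClass(T) = __traits(isAbstractClass, T);

abstract class Shape {}
class Circle : Shape {}

static assert(isAbstractClass!Shape);
static assert(!isAbstractClass!Circle);

void main() {}
```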
Feb 04 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 4 February 2014 at 18:26:10 UTC, deadalnix wrote:
 static if(isReferenceType!T) {
     union {
         T t;
         typeof(null) __;
     }
 }

cool, that would work. @nullable is now totally dead to me.
 * Consistency with all other types. Nullable!int works, 
 Nullable!Object can be passed to a template, inspected, etc. 
 without new traits for isNullable and everything.

I'm not sure I understand that.

" nullable int" wouldn't work. A nullable int needs a separate field to store if it has a value or not, since int == 0 is a valid payload. A Nullable!T template can store the separate field if needed (use static if to add the field or use the union with typeof(null)) and thus work for all types with uniform user-side API.
 I'm really confused now. What are you defending ??

Built-in references become not-null by default. The library type Nullable!T is used when you need null.
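The static if / union idea can be fleshed out into a compilable present-day sketch (illustrative only - this Nullable is a made-up name, not std.typecons.Nullable, and @nullable doesn't exist yet):

```d
import std.traits : isPointer;

// Sketch: zero-overhead nullable for reference types via a union with
// typeof(null); value types fall back to an explicit flag.
struct Nullable(T)
{
    static if (is(T == class) || isPointer!T)
    {
        union
        {
            typeof(null) _; // first member, so the union defaults to null
            T payload;
        }
        bool isNull() const { return payload is null; }
    }
    else
    {
        T payload;
        private bool hasValue = false;
        bool isNull() const { return !hasValue; }
    }
}

void main()
{
    // Same size as a bare reference: the null state reuses the null pointer.
    static assert(Nullable!Object.sizeof == Object.sizeof);
    // A value type needs the extra flag, so it costs more than a bare int.
    static assert(Nullable!int.sizeof > int.sizeof);
    assert(Nullable!Object().isNull && Nullable!int().isNull);
}
```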
Feb 04 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 4 February 2014 at 22:57:06 UTC, Idan Arye wrote:
 So what you are saying is that it should be implemented in the 
 core language(`@nullable`), and then wrapped in the standard 
 library(`Nullable!`) so we can have the benefit of using D's 
 and Phobos' rich template-handling functionality?

That was my first thought, but deadalnix mentioned a union with 
typeof(null) which would achieve the same thing, so I think @nullable 
is useless and we should just focus on the library type Nullable!T.

There's lastly the question of

if(nullable_ref) {
    // nullable_ref is implicitly converted to not null
}

and meh, as far as I'm concerned, this is already a solved problem:

if(auto not_null = nullable_ref) {
    // use not_null
}

But if we wanted the same name's type to magically change, I'd prefer 
to add some kind of operator overloading to hook that in the library 
too. Just spitballing syntax, but:

struct Nullable(T) {
    static if(__traits(potentiallyNullable, T)) {
        union {
            T payload;
            typeof(null) _;
        }
        alias payload if;
    } else {
        struct {
            T payload;
            bool isNull;
        }
        T helper(out bool isValid) {
            isValid = !isNull; // valid when a value is present
            return payload;
        }
        alias helper if;
    }
    /* other appropriate overloads/methods */
}

The last line is the magic. Unlike the existing technique, 
opCast(T:bool)() {}, it would allow changing types inside the if.

Note that something can be done right now using opCast, a helper type, 
and alias this: http://arsdnet.net/dcode/notnullsimplified.d - see 
checkNull (thanks to Andrej Mitrovic for showing this to me).

So I'm generally meh on it, but if we do want to do magic changing 
types, I definitely want it available in the library for more than 
just nullability.
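A rough sketch of the opCast + alias this technique mentioned above (simplified and with made-up names - Widget and checkNull here are illustrative, not the actual notnullsimplified.d code):

```d
// Wrapper that is testable in an if via opCast!bool, while alias this
// forwards member access to the wrapped reference.
struct CheckNull(T)
{
    T payload;
    bool opCast(T2 : bool)() const { return payload !is null; }
    alias payload this;
}

CheckNull!T checkNull(T)(T obj) { return CheckNull!T(obj); }

class Widget { int size = 3; }

void main()
{
    auto w = checkNull(new Widget);
    if (w)                   // opCast!bool decides the branch
        assert(w.size == 3); // alias this reaches the payload
    assert(!checkNull!Widget(null));
}
```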
Feb 04 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 4 February 2014 at 16:00:20 UTC, Meta wrote:
 I'm interested in how this might fit in with your recent 
 discovery in this thread:

 http://forum.dlang.org/thread/majnjuhxdefjuqjlpbmv@forum.dlang.org?page=1

There's a lot of potential for pluggable semantic checks with that. 
Earlier today, I toyed with it for checking virtual functions:

import core.config;

class Foo {
    @virtual void virt() {} /// annotated so OK
    void oops() {} // not annotated, yet virtual, we want a warning
}

module core.config;

string[] types;

template TypeCheck(T) {
    static if(is(T == class))
        enum lol = virtualCheck!T();
}

enum virtual;

bool virtualCheck(T)() {
    foreach(member; __traits(derivedMembers, T)) {
        static if(__traits(isVirtualFunction, __traits(getMember, T, member))) {
            static if(__traits(getAttributes, __traits(getMember, T, member)).length == 0)
                pragma(msg, "Warning: " ~ T.stringof ~ "." ~ member ~ " is virtual");
        }
    }
    return true;
}

which prints:

Warning: Foo.oops is virtual

A similar thing could be set up for @nullable, or maybe even borrowed 
references (though it would fall short there without the help of the 
scope storage class on parameters and return values). It'd scan all the 
types and, if one is not explicitly marked @nullable or NotNull, it 
could throw a static assert or pragma(msg).

But RTInfo can't see local variables nor module-level free functions, 
so it can't do a complete check. I'd love an "RTInfo for modules" 
(enhancement in bugzilla already) and a way to inspect local variables 
with __traits somehow too...

Anyway, the bottom line is RTInfo can help add some of these semantics 
but wouldn't go all the way. Going all the way will break code, but I 
think it will be worth it. Breakage in general is wrong, but when it 
helps find latent bugs and is relatively easy to deal with (adding 
Nullable! or GC! in appropriate places), the benefits may be worth the 
cost. I believe that's the case here.
Feb 04 2014
prev sibling next sibling parent "Idan Arye" <GenericNPC gmail.com> writes:
On Wednesday, 5 February 2014 at 03:12:57 UTC, Adam D. Ruppe 
wrote:
 On Tuesday, 4 February 2014 at 22:57:06 UTC, Idan Arye wrote:
 So what you are saying is that it should be implemented in the 
 core language(` nullable`), and than wrapped in the standard 
 library(`Nullable!`) so we can have the benefit of using D's 
 and Phobos' rich template-handling functionality?

 That was my first thought, but deadalnix mentioned a union with 
 typeof(null) which would achieve the same thing, so I think @nullable 
 is useless and we should just focus on the library type Nullable!T.
 
 There's lastly the question of
 
 if(nullable_ref) {
     // nullable_ref is implicitly converted to not null
 }
 
 and meh, as far as I'm concerned, this is already a solved problem:
 
 if(auto not_null = nullable_ref) {
     // use not_null
 }
 
 But if we wanted the same name's type to magically change, I'd prefer 
 to add some kind of operator overloading to hook that in the library 
 too. Just spitballing syntax, but:
 
 struct Nullable(T) {
     static if(__traits(potentiallyNullable, T)) {
         union {
             T payload;
             typeof(null) _;
         }
         alias payload if;
     } else {
         struct {
             T payload;
             bool isNull;
         }
         T helper(out bool isValid) {
             isValid = !isNull; // valid when a value is present
             return payload;
         }
         alias helper if;
     }
     /* other appropriate overloads/methods */
 }
 
 The last line is the magic. Unlike the existing technique, 
 opCast(T:bool)() {}, it would allow changing types inside the if.
 
 Note that something can be done right now using opCast, a helper type, 
 and alias this: http://arsdnet.net/dcode/notnullsimplified.d - see 
 checkNull (thanks to Andrej Mitrovic for showing this to me).
 
 So I'm generally meh on it, but if we do want to do magic changing 
 types, I definitely want it available in the library for more than 
 just nullability.

The question is - will the optimizer be able to handle this as well as 
it could handle a built-in `@nullable`? If so - I'm all for it, but if 
not - D is still a systems language and performance still counts.

Anyways, the order in the union should be reversed:

union {
    typeof(null) _;
    T payload;
}

The first field of the union is the one that gets initialized. If `T` 
is the first union field, then depending on how non-null-by-default is 
implemented, either an object of type `T` will be implicitly 
constructed or compilation will break when a Nullable is created 
without initializing a value. If `typeof(null)` is the first union 
field, it'll be initialized to `null` - as everyone should expect.
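The initialization rule is easy to check in present-day D (a sketch; since @nullable doesn't exist, a plain pointer stands in for the payload):

```d
// A union's default value comes from its first member's initializer
// (or .init), so putting typeof(null) first guarantees a null default.
union NullFirst
{
    typeof(null) _; // union defaults to null
    int* payload;
}

union PayloadFirst
{
    int x = 42; // union defaults to the first member's initializer
    typeof(null) _;
}

void main()
{
    NullFirst a;
    assert(a.payload is null); // initialized through typeof(null)

    PayloadFirst b;
    assert(b.x == 42);
}
```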
Feb 05 2014
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, February 03, 2014 22:03:13 Andrei Alexandrescu wrote:
 On 2/3/14, 9:09 PM, Jonathan M Davis wrote:
 I truly hope that that's never the case. Adding non-nullable references to
 the language is one thing; making them the default is quite another, and
 making them the default would break existing code. And given Walter's
 normal stance on code breakage, I'd be very surprised if he were in favor
 of making non-nullable references or pointers the default.

We are considering making non-nullables the default, @nullable to mark optionally null objects, and enable the related checks with an opt-in compiler flag.

So, the default is going to be non-nullable except that it isn't really 
non-nullable unless you use a flag? That sounds like -property all over 
again. It could certainly be used to smooth out the transition, but we'd 
have to be way more on top of it than we've been with -property and 
actually get all of the checks ironed out and make it the default within 
a reasonably short period of time rather than years of it going almost 
nowhere.

And if the suggestion is that the flag be around permanently, well, I 
don't see that as turning out any better than it has with -property 
(which has essentially been permanent, in that its functionality has 
never been fully integrated into the language, nor has it gone away). 
That would essentially be forking the language. So, I can only assume 
that the intention is not that the flag be permanent but rather that its 
behavior eventually become the default. That can be made to work, but we 
don't have a good track record on this sort of thing.

Also, completely aside from whether it's the default, how are we going 
to deal with all of the many cases that require an init value? Does it 
effectively become the class equivalent of @disable this()? While 
@disable this() might be a useful feature under certain circumstances, 
it has a lot of nasty repercussions - particularly with generic code - 
and is IMHO one of the biggest warts in the language. It may be a 
necessary one, but it's still a wart, and one that definitely causes 
actual problems rather than being a matter of aesthetics. So, making 
non-nullable references the default would be akin to making 
@disable this() the default for structs. It's not quite as bad, given 
that structs are probably more commonly used than classes in D, but it 
looks to me like it's essentially the same thing. And that seems like a 
really bad idea, just like disabling structs' init values by default 
would be a bad idea.

And are you thinking of this for references only, or are pointers going 
to become non-nullable by default as well? That could cause some grief 
with extern(C) functions, given that they're going to need to take 
nullable pointers, and it could make interacting with AAs more annoying, 
given that their in operator returns a nullable pointer and that that 
pointer must be nullable to do its job.

This whole thing seems like a really bad idea to me. Having non-nullable 
pointers or references can certainly be useful, but it doesn't fit well 
with D's design with regards to init values, and making them the default 
seems like a disaster to me. And if we were going to start changing 
defaults, I would have thought that making pure, @safe, and/or nothrow 
the default would be a lot more beneficial, given how they have to be 
used absolutely everywhere if you want to do anything with them.

- Jonathan M Davis
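The AA point can be made concrete: the pointer returned by in is null exactly when the key is absent, so it could never be typed non-nullable (a small present-day example):

```d
void main()
{
    int[string] counts = ["a": 1];

    // in returns a pointer into the AA, or null if the key is absent -
    // the nullability of that pointer is the lookup's success signal.
    if (auto p = "a" in counts)
        assert(*p == 1);

    assert(("missing" in counts) is null);
}
```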
Feb 05 2014
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 We are considering making non-nullables the default, @nullable 
 to mark optionally null objects, and enable the related checks 
 with an opt-in compiler flag.

It's a significant transition, very interesting. I think it could be 
for the better (I asked something related time ago), but it must be 
designed and handled very well, or not at all.

After such a non-nullable transition there are probably only one or 
two significant things that may be worth adding/changing in D in the 
medium term, and one of them is the implementation of the "scope" 
semantics (and something very related, a bit better ownership).

Bye,
bearophile
Feb 05 2014