
digitalmars.D.learn - null dereference exception vs. segfault?

reply Ryan W Sims <rwsims gmail.com> writes:
The following code fails with a "Bus error" (OSX speak for "Segfault," 
if I understand correctly).

// types.d
import std.stdio;

class A {
     int x = 42;
}

void fail_sometimes(int n) {
     A a;
     if (n == 0) {
         a = new A;  // clearly a contrived example
     }
     assert(a.x == 42, "Wrong x value");
}

void main() {
     fail_sometimes(1);
}

It's even worse if I do a 'dmd -run types.d', it just fails without even 
the minimalistic "Bus error." Is this correct behavior? I searched the 
archives & looked at the FAQ & found workarounds (registering a signal 
handler), but not a justification, and the threads were from a couple 
years ago. Wondering if maybe something has changed and there's a 
problem with my system?

--
rwsims
Aug 01 2010
next sibling parent Jeffrey Yasskin <jyasskin gmail.com> writes:
Even better, you can annotate fail_sometimes with @safe, and it'll
still access out-of-bounds memory.

Take the following with a grain of salt since I'm really new to the language.

gdb says:
Reason: KERN_PROTECTION_FAILURE at address: 0x00000008
0x00001e52 in D4test14fail_sometimesFiZv ()

which indicates that 'a' is getting initialized to null (possibly by
process startup 0ing out the stack), and then x is being read out of
it. You can get exactly the same crashes in C++ by reading member
variables out of null pointers. The D compiler is supposed to catch
the uninitialized variable ("It is an error to use a local variable
without first assigning it a value." in
http://www.digitalmars.com/d/2.0/function.html), but clearly it's
missing this one.

I haven't actually found where in the language spec it says that class
variables are pointers, or what their default values are. I'd expect
to find this in http://www.digitalmars.com/d/2.0/type.html, but no
luck.

Looking through the bug tracker ... Walter's response to
http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate
that he isn't serious about uninitialized use being an error. It's
just undefined behavior like in C++.

In any case, the fix for your problem will be to initialize 'a' before using it.

On Sun, Aug 1, 2010 at 9:59 PM, Ryan W Sims <rwsims gmail.com> wrote:
 The following code fails with a "Bus error" (OSX speak for "Segfault," if I
 understand correctly).

 // types.d
 import std.stdio;

 class A {
     int x = 42;
 }

 void fail_sometimes(int n) {
     A a;
     if (n == 0) {
         a = new A;  // clearly a contrived example
     }
     assert(a.x == 42, "Wrong x value");
 }

 void main() {
     fail_sometimes(1);
 }

 It's even worse if I do a 'dmd -run types.d', it just fails without even the
 minimalistic "Bus error." Is this correct behavior? I searched the archives
 & looked at the FAQ & found workarounds (registering a signal handler), but
 not a justification, and the threads were from a couple years ago. Wondering
 if maybe something has changed and there's a problem with my system?

 --
 rwsims

Aug 02 2010
prev sibling next sibling parent reply Jonathan M Davis <jmdavisprog gmail.com> writes:
On Monday 02 August 2010 00:05:40 Jeffrey Yasskin wrote:
 Even better, you can annotate fail_sometimes with @safe, and it'll
 still access out-of-bounds memory.
 
 Take the following with a grain of salt since I'm really new to the
 language.
 
 gdb says:
 Reason: KERN_PROTECTION_FAILURE at address: 0x00000008
 0x00001e52 in D4test14fail_sometimesFiZv ()
 
 which indicates that 'a' is getting initialized to null (possibly by
 process startup 0ing out the stack), and then x is being read out of
 it. You can get exactly the same crashes in C++ by reading member
 variables out of null pointers. The D compiler is supposed to catch
 the uninitialized variable ("It is an error to use a local variable
 without first assigning it a value." in
 http://www.digitalmars.com/d/2.0/function.html), but clearly it's
 missing this one.
 
 I haven't actually found where in the language spec it says that class
 variables are pointers, or what their default values are. I'd expect
 to find this in http://www.digitalmars.com/d/2.0/type.html, but no
 luck.
 
 Looking through the bug tracker ... Walter's response to
 http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate
 that he isn't serious about uninitialized use being an error. It's
 just undefined behavior like in C++.
 
 In any case, the fix for your problem will be to initialize 'a' before
 using it.

_All_ variables in D are initialized with a default value. There should be _no_ undefined behavior with regards to initializations. D is very conscientious about avoiding undefined behavior. In the case of references and pointers, they are initialized to null.

There's not really such a thing as using a variable without initializing it, because variables are default initialized if you don't initialize them yourself. The _one_ exception would be if you explicitly initialized a variable to void:

    int[] a = void;

In that case, you are _explicitly_ telling the compiler not to default initialize the variable. That _can_ lead to undefined behavior and is definitely unsafe. As such, it is intended solely for the purposes of optimizing code where absolutely necessary.

So, you really shouldn't have any variables in your code that weren't initialized, even if you didn't initialize them explicitly.

The pages that you're looking at there need to be updated for clarity.

- Jonathan M Davis
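The default-initialization rule described in this post can be checked directly. The following is a minimal sketch, not taken from the thread: the class A mirrors the original example, and the other variable names are illustrative only.

```d
import std.math : isNaN;
import std.stdio;

class A {
    int x = 42;
}

void main() {
    A a;        // class reference: default-initialized to null
    int* p;     // pointer: default-initialized to null
    int i;      // int.init is 0
    double d;   // double.init is NaN
    char c;     // char.init is 0xFF

    assert(a is null);
    assert(p is null);
    assert(i == 0);
    assert(isNaN(d));
    assert(c == 0xFF);

    // Opting out: the contents of buf are indeterminate until written.
    int[4] buf = void;
    buf[] = 0;  // assign before reading
    assert(buf[0] == 0);

    writeln("defaults are as expected");
}
```

Note that `a` being null is well defined; it is the later dereference of `a` that crashes in the original program.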
Aug 02 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:
 _All_ variables in D are initialized with a default value. There should be
 _no_ undefined behavior with regards to initializations. D is very
 conscientious about avoiding undefined behavior.

See also: http://d.puremagic.com/issues/show_bug.cgi?id=3820 Bye, bearophile
Aug 02 2010
prev sibling next sibling parent reply Jonathan M Davis <jmdavisprog gmail.com> writes:
On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
 The following code fails with a "Bus error" (OSX speak for "Segfault,"
 if I understand correctly).
 
 // types.d
 import std.stdio;
 
 class A {
      int x = 42;
 }
 
 void fail_sometimes(int n) {
      A a;
      if (n == 0) {
          a = new A;  // clearly a contrived example
      }
      assert(a.x == 42, "Wrong x value");
 }
 
 void main() {
      fail_sometimes(1);
 }
 
 It's even worse if I do a 'dmd -run types.d', it just fails without even
 the minimalistic "Bus error." Is this correct behavior? I searched the
 archives & looked at the FAQ & found workarounds (registering a signal
 handler), but not a justification, and the threads were from a couple
 years ago. Wondering if maybe something has changed and there's a
 problem with my system?
 
 --
 rwsims

You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.

If you changed your code to

    import std.stdio;

    class A {
        int x = 42;
    }

    void fail_sometimes(int n) {
        A a;
        if (n == 0) {
            a = new A;  // clearly a contrived example
        }
        assert(a !is null, "a shouldn't be null");
        assert(a.x == 42, "Wrong x value");
    }

    void main() {
        fail_sometimes(1);
    }

you would get output something like this

    core.exception.AssertError@types.d(12): a shouldn't be null
    ----------------
    ./types() [0x804b888]
    ./types() [0x8049360]
    ./types() [0x8049399]
    ./types() [0x804ba54]
    ./types() [0x804b9b9]
    ./types() [0x804ba91]
    ./types() [0x804b9b9]
    ./types() [0x804b968]
    /opt/lib32/lib/libc.so.6(__libc_start_main+0xe6) [0xf760bc76]
    ./types() [0x8049261]

Unlike Java, there is no such thing as a NullPointerException in D. You just get segfaults - just like you would in C++. So, if you don't want segfaults from dereferencing null references, you need to make sure that they aren't null when you dereference them.

- Jonathan M Davis
Aug 02 2010
parent reply Ryan W Sims <rwsims gmail.com> writes:
On 8/2/10 1:56 AM, Jonathan M Davis wrote:
 On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
 The following code fails with a "Bus error" (OSX speak for "Segfault,"
 if I understand correctly).

 // types.d
 import std.stdio;

 class A {
       int x = 42;
 }

 void fail_sometimes(int n) {
       A a;
       if (n == 0) {
           a = new A;  // clearly a contrived example
       }
       assert(a.x == 42, "Wrong x value");
 }

 void main() {
       fail_sometimes(1);
 }

 It's even worse if I do a 'dmd -run types.d', it just fails without even
 the minimalistic "Bus error." Is this correct behavior? I searched the
 archives&  looked at the FAQ&  found workarounds (registering a signal
 handler), but not a justification, and the threads were from a couple
 years ago. Wondering if maybe something has changed and there's a
 problem with my system?

 --
 rwsims

You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.

Yes, I know *why* I'm getting a segfault, thank you - I set up the example explicitly to defeat the compiler's null checking to test the behavior. I was startled that there wasn't an exception thrown w/ a stack trace. [snip]
 Unlike Java, there is no such thing as a NullPointerException in D. You just
 get segfaults - just like you would in C++. So, if you don't want segfaults
 from dereferencing null references, you need to make sure that they aren't
 null when you dereference them.

 - Jonathan M Davis

That was my question, thanks. It seemed like such an un-D thing to have happen; I was surprised. I guess w/o the backing of a full virtual machine, it's trickier to catch null dereferences on the fly, but boy it'd be nice to have. Don't want to re-fire the debate here, though.

--
rwsims
Aug 02 2010
parent reply Mafi <mafi example.org> writes:
Am 02.08.2010 16:50, schrieb Ryan W Sims:
 On 8/2/10 1:56 AM, Jonathan M Davis wrote:
 On Sunday 01 August 2010 21:59:42 Ryan W Sims wrote:
 The following code fails with a "Bus error" (OSX speak for "Segfault,"
 if I understand correctly).

 // types.d
 import std.stdio;

 class A {
 int x = 42;
 }

 void fail_sometimes(int n) {
 A a;
 if (n == 0) {
 a = new A; // clearly a contrived example
 }
 assert(a.x == 42, "Wrong x value");
 }

 void main() {
 fail_sometimes(1);
 }

 It's even worse if I do a 'dmd -run types.d', it just fails without even
 the minimalistic "Bus error." Is this correct behavior? I searched the
 archives& looked at the FAQ& found workarounds (registering a signal
 handler), but not a justification, and the threads were from a couple
 years ago. Wondering if maybe something has changed and there's a
 problem with my system?

 --
 rwsims

You are getting a segmentation fault because you are dereferencing a null reference. All references are default initialized to null. So, if you fail to explicitly initialize them or to assign to them, then they stay null, and in such a case, you will get a segfault if you try to dereference them.

Yes, I know *why* I'm getting a segfault, thank you - I set up the example explicitly to defeat the compiler's null checking to test the behavior. I was startled that there wasn't an exception thrown w/ a stack trace. [snip]
 Unlike Java, there is no such thing as a NullPointerException in D. You just
 get segfaults - just like you would in C++. So, if you don't want segfaults
 from dereferencing null references, you need to make sure that they aren't
 null when you dereference them.

 - Jonathan M Davis

That was my question, thanks. It seemed like such an un-D thing to have happen; I was surprised. I guess w/o the backing of a full virtual machine, it's trickier to catch null dereferences on the fly, but boy it'd be nice to have. Don't want to re-fire the debate here, though. -- rwsims

If you want a NullPointerException as part of your program flow, you can use enforce() (in std.contracts I think). I don't think catching a NullPointerException in a big code block where you don't know which dereferencing should fail is good style. Mafi
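A minimal sketch of the enforce() approach suggested here, so that a null reference surfaces as a catchable exception instead of a segfault. It assumes a current Phobos, where enforce lives in std.exception; readX is a hypothetical helper, not something from the thread.

```d
import std.exception : enforce;
import std.stdio;

class A {
    int x = 42;
}

// Hypothetical helper: checks the reference before dereferencing it.
int readX(A a) {
    enforce(a !is null, "a must not be null");
    return a.x;
}

void main() {
    writeln(readX(new A));
    try {
        readX(null);
    } catch (Exception e) {
        // enforce throws a plain Exception carrying the given message
        writeln("caught: ", e.msg);
    }
}
```

The check stays in release builds, which is the main difference from assert().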
Aug 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Mafi:
 If you want a NullPointerException as part of your program flow, you can 
 use enforce() (in std.contracts I think). I don't think catching a 
 NullPointerException in a big code block where you don't know which 
 dereferencing should fail is good style.

enforce() is not a panacea (panchrest); as far as I know DMD doesn't inline any function that contains enforce(). So sometimes an assert() is better, especially if it's inside a contract (precondition, etc). Design-by-Contract-style programming is not something that just happens; you have to train yourself for some time for it.

Bye,
bearophile
Aug 02 2010
parent reply Ryan W Sims <rwsims gmail.com> writes:
On 8/2/10 10:33 AM, bearophile wrote:
 Mafi:
 If you want a NullPointerException as part of your program flow, you can
 use enforce() (in std.contracts I think). I don't think catching a
 NullPointerException in a big code block where you don't know which
 dereferencing should fail is good style.

enforce() is not a panacea (panchrest); as far as I know DMD doesn't inline any function that contains enforce(). So sometimes an assert() is better, especially if it's inside a contract (precondition, etc). Design-by-Contract-style programming is not something that just happens; you have to train yourself for some time for it. Bye, bearophile

The problem isn't how to check it on a case-by-case basis, there are plenty of ways to check that a given pointer is non-null. The problem is debugging _unexpected_ null dereferences, for which a NPE or its equivalent is very helpful, a segfault is _not_. Sorry, didn't mean to reopen a can of worms, just wanted to be clear. -- rwsims
Aug 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Ryan W Sims:
 The problem isn't how to check it on a case-by-case basis, there are 
 plenty of ways to check that a given pointer is non-null. The problem is 
 debugging _unexpected_ null dereferences, for which a NPE or its 
 equivalent is very helpful, a segfault is _not_.

I don't know what NPE is, but if you program with DbC your nulls are very often found out by asserts, so you have assert errors (that show line number & file name) instead of segfaults.
 Sorry, didn't mean to reopen a can of worms, just wanted to be clear.

When people that discuss are polite there is no problem in reopening the can now and then :-) Bye, bearophile
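As a sketch of the Design by Contract style mentioned in this post, a precondition can state that the reference is non-null. The function readX is hypothetical. The example uses the `do` keyword for the function body, which current compilers expect; compilers from the time of this thread used `body` instead.

```d
import std.stdio;

class A {
    int x = 42;
}

// Hypothetical function with an in-contract (precondition).
int readX(A a)
in {
    assert(a !is null, "a must not be null");
}
do {
    return a.x;
}

void main() {
    writeln(readX(new A));
    // readX(null) would fail the precondition with an AssertError that
    // names the file and line, unless the program is built with -release,
    // in which case the contract is not compiled in.
}
```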
Aug 02 2010
parent reply Pelle <pelle.mansson gmail.com> writes:
On 08/02/2010 11:27 PM, bearophile wrote:
 Ryan W Sims:
 The problem isn't how to check it on a case-by-case basis, there are
 plenty of ways to check that a given pointer is non-null. The problem is
 debugging _unexpected_ null dereferences, for which a NPE or its
 equivalent is very helpful, a segfault is _not_.

I don't know what NPE is, but if you program with DbC your nulls are very often found out by asserts, so you have assert errors (that show line number& file name) instead of segfaults.

Null Pointer Exception! However, I agree with getting segfaults from them. Otherwise, you will be tempted to use the exception handling mechanisms to catch null pointer exceptions, which is a bad thing. I also agree with the notion of using DbC to find nulls. What I really wish for is non-nullable types, though. Maybe in D3... :P
Aug 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Pelle:

 Null Pointer Exception!

Ah, I see. I hate TLA (Three Letter Acronyms).
 What I really wish for is non-nullable types, though. Maybe in D3... :P

I think there is no enhancement request in Bugzilla about this, I will add one.

To implement this you have to think about the partially uninitialized objects too. This is a paper about it; given a class type T it defines four types (I think the four types are managed by the compiler only, the programmer uses only two of them, nullable class references and nonnullable ones): http://research.microsoft.com/pubs/67461/non-null.pdf

If a language defaults to nonnullable references, then you can use this syntax:

    class T {}
    T nonnullable_instance = new T;
    T? nullable_instance;

But now it's probably nearly impossible to make D references nonnullable on default, so that syntax can't be used. And I don't know what syntax to use yet. Suggestions welcome.

Bye,
bearophile
Aug 02 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
 But now it's probably nearly impossible to make D references nonnullable on
 default, so that syntax can't be used. And I don't know what syntax to use yet.
 Suggestions welcome.

One of the few ideas I have had is to use the suffix for this:

    class T {}
    T nullable_reference;
    T nonnullable_reference = new T ();

    struct S {}
    S nullable_pointer;
    S nonnullable_pointer = new S ();

(Beside nonnullable class references/pointers, another way to catch bugs that I miss in D are the ranged integers of ObjectPascal/Ada. Walter doesn't like them, I think he thinks they are a failed idea, but I don't agree and I don't remember why he thinks so.)

Bye,
bearophile
Aug 02 2010
prev sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 08/03/2010 12:02 AM, bearophile wrote:
 Pelle:
 What I really wish for is non-nullable types, though. Maybe in D3... :P

I think there is no enhancement request in Bugzilla about this, I will add one.

I think there has been, at least this has been discussed on the newsgroup.
 To implement this you have to think about the partially uninitialized objects
too, this is a paper about it, given a class type T it defines four types (I
think the four types are managed by the compiler only, the programmer uses only
two of them, nullable class references and nonnullable ones):
 http://research.microsoft.com/pubs/67461/non-null.pdf

 If a language defaults to nonnullable references, then you can use this syntax:

 class T {}
 T nonnullable_instance = new T;
 T? nullable_instance;

 But now it's probably nearly impossible to make D references nonnullable on
 default, so that syntax can't be used. And I don't know what syntax to use yet.
 Suggestions welcome.

 Bye,
 bearophile

That is a good syntax indeed. What is also needed is a way of conditionally getting the reference out of the nullable. I think delight uses something like this:

    T? nullable;
    if actual = nullable:
        actual.dostuff;

I think a good thing would be NonNull!T, but I haven't managed to create one. If this structure exists and becomes good practice to use, maybe we can get the good syntax in D3. In 20 years or so :P
Aug 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Pelle:
 I think a good thing would be NonNull!T, but I haven't managed to create 
 one. If this structure exists and becomes good practice to use, maybe we 
 can get the good syntax in D3. In 20 years or so :P

Maybe we are talking about two different things, I was talking about nonnull class references/pointers, you seem to talk about nullable values :-) Both can be useful in D, but they are different things. Nullable values are simpler to design, they are just wrapper structs that contain a value plus a boolean, plus if you want some syntax sugar to manage them with a shorter syntax. Bye, bearophile
Aug 02 2010
parent reply Pelle <pelle.mansson gmail.com> writes:
On 08/03/2010 12:32 AM, bearophile wrote:
 Pelle:
 I think a good thing would be NonNull!T, but I haven't managed to create
 one. If this structure exists and becomes good practice to use, maybe we
 can get the good syntax in D3. In 20 years or so :P

Maybe we are talking about two different things, I was talking about nonnull class references/pointers, you seem to talk about nullable values :-) Both can be useful in D, but they are different things. Nullable values are simpler to design, they are just wrapper structs that contain a value plus a boolean, plus if you want some syntax sugar to manage them with a shorter syntax. Bye, bearophile

I am talking about non-nullable references indeed. I don't think I mentioned nullable types, really.

I also created this, as the simplest NotNull-type conceivable:

    import std.conv : text;
    import std.exception : enforce; // std.contracts in older Phobos releases

    struct NotNull(T) if(is(typeof(T.init !is null))) {
        private T _instance;

        this(T t) {
            enforce(t !is null, "Cannot create NotNull from null");
            _instance = t;
        }

        T get() {
            assert (_instance !is null,
                text("Supposed NotNull!(", T.stringof, ") is null"));
            return _instance;
        }

        alias get this;
    }

This has the obvious bug in that you can declare a nonnull without an initializer and get a null from it. If we ever get @disable this(){} for structs, this struct can become better.

I'll probably try it out in some code.
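D has since gained `@disable this();` for structs, so the gap described in this post can be narrowed. The following is a sketch of that variant under the same design, not a vetted library type; the class A is only there for the demonstration.

```d
import std.exception : enforce;
import std.stdio;

struct NotNull(T) if (is(typeof(T.init !is null))) {
    private T _instance;

    // Forbid `NotNull!T x;` without an initializer.
    @disable this();

    this(T t) {
        enforce(t !is null, "Cannot create NotNull from null");
        _instance = t;
    }

    T get() {
        return _instance;
    }

    alias get this;
}

class A {
    int x = 42;
}

void main() {
    auto a = NotNull!A(new A);
    writeln(a.x);       // forwarded to the wrapped reference via alias this

    // NotNull!A b;     // rejected at compile time: default construction is disabled
}
```

This still leaves `NotNull!A.init` as a way to obtain a wrapped null, so it reduces the problem rather than removing it.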
Aug 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Pelle:

 struct NotNull(T) if(is(typeof(T.init !is null))) {

Is this enough?

    struct NotNull(T) if (is(T.init is null)) {

      this(T t) {
          enforce(t !is null, "Cannot create NotNull from null");

enforce() is bad, use Design by Contract instead (a precondition with an assert inside). Bye, bearophile
Aug 02 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
 Is this enough?
 struct NotNull(T) if (is(T.init is null)) {

Sorry, I meant:

    struct NotNull(T) if (T.init is null) {

Bye,
bearophile
Aug 02 2010
prev sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 08/03/2010 01:08 AM, bearophile wrote:
 Pelle:

 struct NotNull(T) if(is(typeof(T.init !is null))) {

Is this enough? struct NotNull(T) if (is(T.init is null)) {
       this(T t) {
           enforce(t !is null, "Cannot create NotNull from null");

enforce() is bad, use Design by Contract instead (a precondition with an assert inside). Bye, bearophile

If NotNull will be in a library, it should probably use enforce, if I have understood things correctly. External input, and all that. I think most of phobos does it like this currently.
Aug 02 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Pelle:
 If NotNull will be in a library, it should probably use enforce, if I 
 have understood things correctly. External input, and all that. I think 
 most of phobos does it like this currently.

I suspect that Andrei has still to "get" DbC :-) (And your lib is not Phobos.) Bye, bearophile
Aug 02 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 02 Aug 2010 00:59:42 -0400, Ryan W Sims <rwsims gmail.com> wrote:

 The following code fails with a "Bus error" (OSX speak for "Segfault,"  
 if I understand correctly).

 // types.d
 import std.stdio;

 class A {
      int x = 42;
 }

 void fail_sometimes(int n) {
      A a;
      if (n == 0) {
          a = new A;  // clearly a contrived example
      }
      assert(a.x == 42, "Wrong x value");
 }

 void main() {
      fail_sometimes(1);
 }

 It's even worse if I do a 'dmd -run types.d', it just fails without even  
 the minimalistic "Bus error." Is this correct behavior? I searched the  
 archives & looked at the FAQ & found workarounds (registering a signal  
 handler), but not a justification, and the threads were from a couple  
 years ago. Wondering if maybe something has changed and there's a  
 problem with my system?

I'm not familiar with dmd -run, but you should be aware that asserts are not compiled into release code. Try changing the assert to this:

    if(a.x != 42) writeln("Wrong x value");

FWIW, D does not have null pointer exceptions, even in debug mode. It's an oft-debated subject, but Walter hasn't ever budged on it. His view is that you should use a debugger to see where your code is failing. We have pointed out countless times that often it's not possible to have a debugger at hand, or even be able to reproduce the issue that caused the segfault while in a different environment.

I don't know if we'll ever see null pointer exceptions, but I'd love them in debug mode only, or at least to see a stack trace when it occurs. The latter can be done without Phobos/dmd help if someone can write such a signal handler function. I don't know enough about stack traces to understand how to do it.

-Steve
Aug 02 2010
prev sibling next sibling parent reply Jeffrey Yasskin <jyasskin gmail.com> writes:
On Mon, Aug 2, 2010 at 1:49 AM, Jonathan M Davis <jmdavisprog gmail.com> wrote:
 On Monday 02 August 2010 00:05:40 Jeffrey Yasskin wrote:
 Even better, you can annotate fail_sometimes with @safe, and it'll
 still access out-of-bounds memory.

 Take the following with a grain of salt since I'm really new to the
 language.

 gdb says:
 Reason: KERN_PROTECTION_FAILURE at address: 0x00000008
 0x00001e52 in D4test14fail_sometimesFiZv ()

 which indicates that 'a' is getting initialized to null (possibly by
 process startup 0ing out the stack), and then x is being read out of
 it. You can get exactly the same crashes in C++ by reading member
 variables out of null pointers. The D compiler is supposed to catch
 the uninitialized variable ("It is an error to use a local variable
 without first assigning it a value." in
 http://www.digitalmars.com/d/2.0/function.html), but clearly it's
 missing this one.

 I haven't actually found where in the language spec it says that class
 variables are pointers, or what their default values are. I'd expect
 to find this in http://www.digitalmars.com/d/2.0/type.html, but no
 luck.

 Looking through the bug tracker ... Walter's response to
 http://d.puremagic.com/issues/show_bug.cgi?id=671 seems to indicate
 that he isn't serious about uninitialized use being an error. It's
 just undefined behavior like in C++.

 In any case, the fix for your problem will be to initialize 'a' before
 using it.

_All_ variables in D are initialized with a default value. There should be _no_ undefined behavior with regards to initializations. D is very conscientious about avoiding undefined behavior. In the case of references and pointers, they are initialized to null.

That's good to know. Unfortunately, reading through a null pointer does cause undefined behavior: it's not a guaranteed segfault. Consider an object with a large array at the beginning, which pushes later members past the empty pages at the beginning of the address space. I don't suppose the D compiler watches for such large objects and emits actual null checks before indexing into them?
 The pages that you're looking at there need to be updated for clarity.

Nice use of the passive voice. Who needs to update them? Is their source somewhere you or I could send a patch?
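To make the large-object concern in this post concrete, the sketch below only measures how far a field can sit from the start of an object; it never dereferences null. The class Big and its 1 MiB padding are made up for illustration. Whether a read at that offset from a null reference actually faults depends on how much low address space the platform leaves unmapped.

```d
import std.stdio;

class Big {
    ubyte[1024 * 1024] padding;  // 1 MiB of data ahead of the next field
    int later;
}

void main() {
    auto b = new Big;
    // With a null reference, reading `later` would touch address 0 + this
    // offset, which is well beyond the first page of the address space.
    writeln("offset of later: ", b.later.offsetof);
    assert(b.later.offsetof >= 1024 * 1024);
}
```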
Aug 02 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Jeffrey Yasskin:
 That's good to know. Unfortunately, reading through a null pointer
 does cause undefined behavior: it's not a guaranteed segfault.
 Consider an object with a large array at the beginning, which pushes
 later members past the empty pages at the beginning of the address
 space. I don't suppose the D compiler watches for such large objects
 and emits actual null checks before indexing into them?

I am not expert enough to give you a good answer about this, but do some tests :-) And later if you want you may say the same things in the main D newsgroup. Bye, bearophile
Aug 02 2010
prev sibling parent reply Jonathan M Davis <jmdavisprog gmail.com> writes:
On Monday, August 02, 2010 08:34:50 Jeffrey Yasskin wrote:
 That's good to know. Unfortunately, reading through a null pointer
 does cause undefined behavior: it's not a guaranteed segfault.
 Consider an object with a large array at the beginning, which pushes
 later members past the empty pages at the beginning of the address
 space. I don't suppose the D compiler watches for such large objects
 and emits actual null checks before indexing into them?

There are no null checks. When people have requested in the past that null checks be added (like you'd get in Java), Walter has indicated that he thought that there was no point to them because the OS takes care of them already by giving you a segfault. I'm not personally well-versed enough in exactly what goes on at the hardware or OS level to produce a segfault, so I can't say whether a segfault is absolutely guaranteed. It has been my understanding that it is.

As for indexing into an array, the array itself should be null or not. It has no size if it's null, so it makes no sense to talk about large arrays which are null. On top of that, bounds checking is usually done on arrays (off of the top of my head, I don't remember the exact circumstances under which it's removed, but it's almost always there), so you wouldn't be able to index past its end, and if it's an element of the array that you're dereferencing, then whether that element is null or not will determine whether it segfaults.
 The pages that you're looking at there need to be updated for clarity.

Nice use of the passive voice. Who needs to update them? Is their source somewhere you or I could send a patch?

Submit a bug report to bugzilla: http://d.puremagic.com/issues/ - Jonathan M Davis
Aug 02 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:
 As for indexing into an array, the array itself should be null or not. It has
 no size if it's null, so it makes no sense to talk about large arrays which are
 null.

Technically dynamic arrays in D are represented with a 2-word struct that contains a pointer and length. So empty dynamic arrays are two zero words. In D there is also the literal [] that in my opinion is better to represent an empty array than just "null": http://d.puremagic.com/issues/show_bug.cgi?id=3889 Bye, bearophile
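The pointer-plus-length representation described in this post can be observed from ordinary code. This is a small sketch with illustrative variable names.

```d
import std.stdio;

void main() {
    int[] a;                    // default-initialized dynamic array

    // A dynamic array is two machine words: a length and a pointer.
    static assert((int[]).sizeof == 2 * size_t.sizeof);

    assert(a is null);
    assert(a.ptr is null);
    assert(a.length == 0);

    int[] b = [];               // the empty array literal
    assert(b.length == 0);
    assert(a == b);             // both are empty, so they compare equal

    a ~= 1;                     // appending allocates storage
    assert(a.length == 1);
    assert(a !is null);

    writeln("ok");
}
```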
Aug 02 2010