
digitalmars.D.learn - Why does nobody seem to think that `null` is a serious problem in D?

reply Jordi Gutiérrez Hermoso <jordigh octave.org> writes:
When I was first playing with D, I managed to create a segfault 
by doing `SomeClass c;` and then trying to do something with the 
object I thought I had default-constructed, by analogy with C++ 
syntax. Seasoned D programmers will recognise that I did nothing 
of the sort: I had instead declared `c` as a null reference, and 
my program ended up dereferencing a null pointer.
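For C++ readers, a minimal sketch of the pitfall (class name hypothetical):

```
import std.stdio;

class SomeClass { int x; }

void main()
{
    SomeClass c;              // declares a null class reference; nothing is constructed
    // writeln(c.x);          // would dereference null and segfault
    auto s = new SomeClass(); // the actual D equivalent of C++'s `SomeClass c;`
    writeln(s.x);             // prints 0, int's default value
}
```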

I'm not the only one who has done this. I can't find it right 
now, but I've seen at least one person open a bug report because 
they misunderstood this as a bug in dmd.

I have been told a couple of times that this isn't something that 
needs to be patched in the language, but I don't understand. It 
seems like a very easy way to generate a segfault (and not a 
NullPointerException or whatever).

What's the reasoning for allowing this?
Nov 19 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
Nov 19 2018
next sibling parent reply aberba <karabutaworld gmail.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
Does D have a linter which warns about certain style of coding like this?
Nov 20 2018
next sibling parent Kagamin <spam here.lot> writes:
On Tuesday, 20 November 2018 at 09:27:03 UTC, aberba wrote:
 Does D have a linter which warns about certain style of coding 
 like this?
AFAIK, dscanner does some linting.
Nov 20 2018
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 20 November 2018 at 09:27:03 UTC, aberba wrote:
 Does D have a linter which warns about certain style of coding 
 like this?
dscanner might check it. I don't know though.
Nov 20 2018
prev sibling next sibling parent reply aliak <something something.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it.
This only applies to little scripts and unittests maybe. Not when you're writing any kind of relatively larger application that involves being run for longer or if there's more possible permutations of your state variables.
Nov 20 2018
parent reply Kagamin <spam here.lot> writes:
On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger 
 application that involves being run for longer or if there's 
 more possible permutations of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
Nov 20 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 15:29:50 +0000, Kagamin wrote:
 On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger application that
 involves being run for longer or if there's more possible permutations
 of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
A pointer to a struct is a reference type. There are plenty of cases where you need reference semantics for a thing. If you are optimistic about the attentiveness of your future self and potential collaborators, you might add that into a doc comment; people can either manually do escape analysis to check if they can store the thing on the stack, or allocate on the heap and pass a pointer, or embed the thing as a private member of another thing that now must be passed by reference. If you're pessimistic and don't mind a potential performance decrease, you might defensively use a class to ensure the thing is a reference type.
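A sketch of the struct-vs-class reference semantics described above (names hypothetical):

```
struct Thing { int count; }

void bumpByValue(Thing t)    { t.count++; } // value copy: caller's Thing unchanged
void bumpByPointer(Thing* t) { t.count++; } // pointer to struct: reference semantics

class ThingBox { Thing payload; }           // class: always a reference type

void main()
{
    Thing t;
    bumpByValue(t);
    assert(t.count == 0); // only the copy was modified
    bumpByPointer(&t);
    assert(t.count == 1); // modified through the pointer
}
```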
Nov 20 2018
prev sibling parent reply Aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 15:29:50 UTC, Kagamin wrote:
 On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger 
 application that involves being run for longer or if there's 
 more possible permutations of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
I’m not sure I understood your point? I was saying that you don’t necessarily hit your null dereference within a couple of seconds.
Nov 20 2018
parent Chris Katko <ckatko gmail.com> writes:
My favorite D'ism so far:

 Try to learn D.
 Put writeln in destructor to prove it works as expected.
 Make random changes, program never runs again.
 Takes 30+ minutes to realize that writeln("my string") is fine, 
 but writeln("my string " ~ value) is an allocation / garbage 
 collection which crashes the program without a stack.
Nov 20 2018
prev sibling parent reply NoMoreBugs <NoMoreBugs gmail.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that 
c was never assigned to, and produce a compile time error:

"c is never assigned to, and will always have its default value null"

That doesn't sound that hard to me.
Nov 21 2018
next sibling parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 10:47:35 UTC, NoMoreBugs wrote:
 On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe 
 wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that c was never assigned to, and produce a compile time error: "c is never assigned to, and will always have its default value null" That doesn't sound that hard to me.
Am I misled, or isn't this impossible by design?

´´´
import std.stdio;
import std.random;

class C
{
	size_t dummy;
	final void baz()
	{
		if(this is null)
		{
			writeln(42);
		}
		else
		{
			writeln(dummy);
		}
	}
}
void main()
{
	C c;
	c.foo;
}

void foo(ref C c)
{
	if(uniform01 < 0.5)
	{
		c = new C();
		c.dummy = unpredictableSeed;
	}
	c.baz;
}
´´´
Nov 21 2018
parent reply Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 11:53:14 UTC, Alex wrote:
 Am I misled, or isn't this impossible by design?

 ´´´
 import std.stdio;
 import std.random;

 class C
 {
 	size_t dummy;
 	final void baz()
 	{
 		if(this is null)
 		{
 			writeln(42);
 		}
 		else
 		{
 			writeln(dummy);
 		}
 	}
 }
 void main()
 {
 	C c;
 	c.foo;
 }

 void foo(ref C c)
 {
 	if(uniform01 < 0.5)
 	{
 		c = new C();
 		c.dummy = unpredictableSeed;
 	}
 	c.baz;
 }
 ´´´
A value passed to ref parameter is assumed to be initialized. A 
compiler that required initialization before use would reject the 
call to function foo.
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 14:21:44 UTC, Kagamin wrote:
 A value passed to ref parameter is assumed to be initialized. 

This was not my point. I wonder whether the case, where the 
compiler can't figure out the initialization state of an object, 
is so hard to construct.

´´´
import std.experimental.all;

class C
{
	size_t dummy;
	final void baz()
	{
		if(this is null)
		{
			writeln(42);
		}
		else
		{
			writeln(dummy);
		}
	}
}
void main()
{
	C c;
	if(uniform01 < 0.5)
	{
		c = new C();
		c.dummy = unpredictableSeed;
	}
	else
	{
		c = null;
	}
	c.baz;
	writeln(c is null);
}
´´´
Nov 21 2018
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Wed, 21 Nov 2018 17:00:29 +0000, Alex wrote:

In C#, that would be a compile error (use of an unassigned local 
object), but in D, it compiles and runs and doesn't segfault.
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 17:09:54 UTC, Neia Neutuladh 
wrote:
 On Wed, 21 Nov 2018 17:00:29 +0000, Alex wrote:

object), but in D, it compiles and runs and doesn't segfault.
No, it wouldn't. And it doesn't.

´´´
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Diagnostics;

namespace ConsoleApp1
{
    sealed class C
    {
        public int dummy;
        public void baz()
        {
            if (this is null)
            {
                Debug.WriteLine(42);
            }
            else
            {
                Debug.WriteLine(dummy);
            }
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            C c;
            Random random = new Random(4);
            int randomNumber = random.Next(0, 100);
            if (randomNumber < 50)
            {
                c = new C { dummy = 73 };
            }
            else
            {
                c = null;
            }
            c.baz();
        }
    }
}
´´´

compiled against 4.6.1 Framework.

However, of course, there is a NullReferenceException, if c 
happens to be null when calling baz.

So the difference is not the compiler behavior, but just the 
runtime behavior...

How could the compiler know the state of Random anyway, before 
the program runs?
Nov 21 2018
parent reply aliak <something something.com> writes:
On Wednesday, 21 November 2018 at 17:46:29 UTC, Alex wrote:
 compiled against 4.6.1 Framework.

 However, of course, there is a NullReferenceException, if c 
 happens to be null, when calling baz.

 So the difference is not the compiler behavior, but just the 
 runtime behavior...

 How could the compiler know the state of Random anyway, before 
 the program run.
The compiler would not be able to prove that something was nil or 
not, but swift certainly does:

```
class C {
    func baz() {}
}

func f() {
    var x: C
    if Int.random(in: 0 ..< 10) < 5 {
        x = C()
    }
    x.baz()
}
```

error: variable 'x' used before being initialized
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 21:05:37 UTC, aliak wrote:
 On Wednesday, 21 November 2018 at 17:46:29 UTC, Alex wrote:
 compiled against 4.6.1 Framework.

 However, of course, there is a NullReferenceException, if c 
 happens to be null, when calling baz.

 So the difference is not the compiler behavior, but just the 
 runtime behavior...

 How could the compiler know the state of Random anyway, before 
 the program run.
 The compiler would not be able to prove that something was nil 
 or not, but swift certainly does:

 ```
 class C {
     func baz() {}
 }

 func f() {
     var x: C
     if Int.random(in: 0 ..< 10) < 5 {
         x = C()
     }
     x.baz()
 }
 ```

 error: variable 'x' used before being initialized
Nice! Didn't know that... But the language is a foreign one for 
me.

Nevertheless, from what I saw: shouldn't it be

var x: C?

as an optional kind? Because otherwise, I can't assign a nil to 
the instance, which I can do to a class instance in D... :)

Comparing non-optional types from swift with classes in D is... 
yeah... hmm... evil ;)

And if you assume a kind which cannot be nil, then you are again 
with structs here...

But I wondered about something different: even if the compiler 
would check the existence of an assignment, the runtime 
information cannot be deduced, if I understand this correctly. 
And if so, it cannot be checked at compile time, if something is 
or is not null. Right?
Nov 21 2018
parent aliak <something something.com> writes:
On Wednesday, 21 November 2018 at 23:27:25 UTC, Alex wrote:
 Nice! Didn't know that... But the language is a foreign one for 
 me.

 Nevertheless, from what I saw:
 Shouldn't it be
 var x: C?
 as an optional kind, because otherwise, I can't assign a nil to 
 the instance, which I can do to a class instance in D...

 out! :) )
This is true. But then the difference is that you can't* call a 
method on an optional variable without first unwrapping it (which 
is enforced at compile time as well).

* You can force unwrap it, and then you'd get a segfault if there 
was nothing inside the optional. But most times, if you see 
someone force unwrapping an optional, it's a code smell in swift.
 Comparing non-optional types from swift with classes in D is... 
 yeah... hmm... evil ;)
Hehe, maybe in a way. Was just trying to show that compilers can fix the null reference "problem" at compile time. And that flow analysis can detect initialization.
 And if you assume a kind which cannot be nil, then you are 
 again with structs here...

 But I wondered about something different:
 Even if the compiler would check the existence of an 
 assignment, the runtime information cannot be deduced, if I 
 understand this correctly. And if so, it cannot be checked at 
 compile time, if something is or is not null. Right?
Aye. But depending on how a language is designed this problem - if you think it is one - can be dealt with. It's why swift has optionals built in to the language.
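D doesn't build optionals into the language the way swift does, but the pattern can be roughly sketched with std.typecons.Nullable (the function and names here are hypothetical):

```
import std.stdio;
import std.typecons : Nullable, nullable;

// Returns "some int" or "none", instead of an int that might secretly be absent.
Nullable!int firstEven(int[] xs)
{
    foreach (x; xs)
        if (x % 2 == 0)
            return nullable(x);
    return Nullable!int.init; // the "none" state
}

void main()
{
    auto r = firstEven([1, 3, 4]);
    if (!r.isNull)       // caller is nudged to check before unwrapping
        writeln(r.get);  // prints 4
}
```

Note the difference from swift: calling .get on an empty Nullable is only caught at runtime, not at compile time, so this is a library convention rather than enforced optionals.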
Nov 22 2018
prev sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 17:00:29 UTC, Alex wrote:
 This was not my point. I wonder, whether the case, where the 
 compiler can't figure out the initialization state of an object 
 is so hard to construct.

 ´´´
 import std.experimental.all;

 class C
 {
 	size_t dummy;
 	final void baz()
 	{
 		if(this is null)
 		{
 			writeln(42);
 		}
 		else
 		{
 			writeln(dummy);
 		}
 	}
 }
 void main()
 {
 	C c;
 	if(uniform01 < 0.5)
 	{
 		c = new C();
 		c.dummy = unpredictableSeed;
 	}
         else
         {
                 c = null;
         }
 	c.baz;
 	writeln(c is null);
 }
 ´´´


As `c` is initialized in both branches, compiler knows it's always in initialized state after the if statement.
Nov 22 2018
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 21 November 2018 at 10:47:35 UTC, NoMoreBugs wrote:
 On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe 
 wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that c was never assigned to, and produce a compile time error: "c is never assigned to, and will always have its default value null" That doesn't sound that hard to me.
For _TRIVIAL_ cases this is not hard.

But we cannot only worry about trivial cases; we have to consider 
_all_ cases.

Therefore we had better not emit an error in a trivial case: 
doing so could lead users to assume that we are detecting all the 
cases. That in turn would give the impression of an unreliable 
system, and indeed that impression would not be too far from the 
truth.
Nov 21 2018
parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Wednesday, 21 November 2018 at 17:11:23 UTC, Stefan Koch wrote:
 For _TRIVIAL_cases this is not hard.

 But we cannot only worry about trivial cases;
 We have to consider _all_ cases.

 Therefore we better not emit an error in a trivial case.
 Which could lead users to assume that we are detecting all the 
 cases.
 That in turn will give the impression of an unreliable system, 
 and indeed that impression would not be too far from the truth.
On the face of it, that seems a reasonable argument, i.e. 
consistency.

On the other hand, I see nothing 'reliable' about handing off the 
responsibility of detecting run-time errors to the o/s ;-)

I would prefer to catch these errors at compile time, or at run 
time. D can do neither, it seems.
Nov 21 2018
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/19/18 4:23 PM, Jordi Gutiérrez Hermoso wrote:
 When I was first playing with D, I managed to create a segfault by doing 
 `SomeClass c;` and then trying do something with the object I thought I 
 had default-created, by analogy with C++ syntax. Seasoned D programmers 
 will recognise that I did nothing of the sort and instead created c is 
 null and my program ended up dereferencing a null pointer.
 
 I'm not the only one who has done this. I can't find it right now, but 
 I've seen at least one person open a bug report because they 
 misunderstood this as a bug in dmd.
 
 I have been told a couple of times that this isn't something that needs 
 to be patched in the language, but I don't understand. It seems like a 
 very easy way to generate a segfault (and not a NullPointerException or 
 whatever).
 
 What's the reasoning for allowing this?
A null pointer dereference is an immediate error, and it's also a 
safe error. It does not cause corruption, and it is free (the MMU 
is doing it for you).

Note, you can get a null pointer exception on Linux by using 
etc.linux.memoryerror: 
https://github.com/dlang/druntime/blob/master/src/etc/linux/memoryerror.d

The worst part about a null-pointer segfault is when it's 
intermittent and you get no information about where it happens. 
Then it can be annoying to track down. But it can't be used as an 
exploit. Consistent segfaults are generally easy to figure out.

-Steve
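A sketch of the etc.linux.memoryerror hack mentioned above, assuming its registerMemoryErrorHandler entry point (Linux-only; catching an Error like this is for illustration, not production use):

```
// Linux-only sketch: druntime's etc.linux.memoryerror installs a SIGSEGV
// handler that turns a null dereference into a throwable Error with a trace.
import std.stdio;

void main()
{
    version (linux)
    {
        import etc.linux.memoryerror : registerMemoryErrorHandler;
        registerMemoryErrorHandler();
    }

    int* p = null;
    try
        writeln(*p);       // null dereference
    catch (Error e)        // an Error instead of a bare segfault, with the handler installed
        writeln("caught: ", e.msg);
}
```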
Nov 19 2018
parent reply Jordi =?UTF-8?B?R3V0acOpcnJleg==?= Hermoso <jordigh octave.org> writes:
On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer 
wrote:

 A null pointer dereference is an immediate error, and it's also 
 a safe error. It does not cause corruption, and it is free (the 
 MMU is doing it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Nov 19 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/19/18 7:21 PM, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer wrote:
 
 A null pointer dereference is an immediate error, and it's also a safe 
 error. It does not cause corruption, and it is free (the MMU is doing 
 it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
It's true for all OSes that D supports, and for most modern operating systems, that run in protected mode. It would NOT necessarily be true for kernel modules or an OS kernel, so that is something to be concerned about.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Yes and no. It's good to remember that this is a HARDWARE 
generated exception, and each OS handles it differently.

It's also important to remember that a segmentation fault is NOT 
necessarily the result of a simple error like forgetting to 
initialize a variable. It could be a serious memory corruption 
error. Generating stack traces can be dangerous in this kind of 
state.

As I said, on Linux you can enable a "hack" that generates an 
error for a null dereference. On Windows, I believe that it 
already generates an exception without any modification. On other 
OSes you may be out of luck until someone figures out a nice 
clever hack for it.

And if it's repeatable, you can always run in a debugger to see 
where the error is occurring.

-Steve
Nov 19 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 19, 2018 5:30:00 PM MST Steven Schveighoffer via 
Digitalmars-d-learn wrote:
On 11/19/18 7:21 PM, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer wrote:
 A null pointer dereference is an immediate error, and it's also a safe
 error. It does not cause corruption, and it is free (the MMU is doing
 it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
It's true for all OSes that D supports, and for most modern operating systems, that run in protected mode. It would NOT necessarily be true for kernel modules or an OS kernel, so that is something to be concerned about.
For @safe to function properly, dereferencing null _must_ be guaranteed to be memory safe, and for dmd it is, since it will always segfault. Unfortunately, as I understand it, it is currently possible with ldc's optimizer to run into trouble, since it'll do things like see that something must be null and therefore assume that it must never be dereferenced, since it would clearly be wrong to dereference it. And then when the code hits a point where it _does_ try to dereference it, you get undefined behavior. It's something that needs to be fixed in ldc, but based on discussions I had with Johan at dconf this year about the issue, I suspect that the spec is going to have to be updated to be very clear on how dereferencing null has to be handled before the ldc guys do anything about it. As long as the optimizer doesn't get involved, everything is fine, but as great as optimizers can be at making code faster, they aren't really written with stuff like @safe in mind.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Yes and no. It's good to remember that this is a HARDWARE generated exception, and each OS handles it differently. It's also important to remember that a segmentation fault is NOT necessarily the result of a simple error like forgetting to initialize a variable. It could be a serious memory corruption error. Generating stack traces can be dangerous in this kind of state. As I said, on Linux you can enable a "hack" that generates an error for a null dereference. On Windows, I believe that it already generates an exception without any modification. On other OSes you may be out of luck until someone figures out a nice clever hack for it. And if it's repeatable, you can always run in a debugger to see where the error is occurring.
Also, if your OS supports core dumps, and you have them turned on, then it's trivial to get a stack trace - as well as a lot more of the program state. - Jonathan M Davis
Nov 19 2018
next sibling parent reply Johan Engelen <j j.nl> writes:
On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis 
wrote:
 For @safe to function properly, dereferencing null _must_ be 
 guaranteed to be memory safe, and for dmd it is, since it will 
 always segfault. Unfortunately, as understand it, it is 
 currently possible with ldc's optimizer to run into trouble, 
 since it'll do things like see that something must be null and 
 therefore assume that it must never be dereferenced, since it 
 would clearly be wrong to dereference it. And then when the 
 code hits a point where it _does_ try to dereference it, you 
 get undefined behavior. It's something that needs to be fixed 
 in ldc, but based on discussions I had with Johan at dconf this 
 year about the issue, I suspect that the spec is going to have 
 to be updated to be very clear on how dereferencing null has to 
 be handled before the ldc guys do anything about it. As long as 
 the optimizer doesn't get involved everything is fine, but as 
 great as optimizers can be at making code faster, they aren't 
 really written with stuff like @safe in mind.
One big problem is the way people talk and write about this 
issue. There is a difference between "dereferencing" in the 
language, and reading from a memory address by the CPU. Confusing 
language semantics with what the CPU is doing happens often in 
the D community and is not helping these debates.

D is proclaiming that dereferencing `null` must segfault, but 
that is not implemented by any of the compilers. It would require 
inserting null checks upon every dereference. (This may not be as 
slow as you may think, but it would probably not make code run 
faster.)

An example:
```
class A {
    int i;
    final void foo() {
        import std.stdio; writeln(__LINE__);
        // i = 5;
    }
}

void main() {
    A a;
    a.foo();
}
```

In this case, the actual null dereference happens on the last 
line of main. The program runs fine however since dlang 2.077. 
Now when `foo` is modified such that it writes to member field 
`i`, the program does segfault (writes to address 0).

D does not make dereferencing on class objects explicit, which 
makes it harder to see where the dereference is happening.

So, I think all compiler implementations are not spec compliant 
on this point. I think most people believe that compliance is too 
costly for the kind of software one wants to write in D; the 
issue is similar to array bounds checking that people explicitly 
disable or work around. For compliance we would need to change 
the compiler to emit null checks on all @safe dereferences (the 
opposite direction was chosen in 2.077). It'd be interesting to 
do the experiment.

-Johan
Nov 20 2018
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/20/18 1:04 PM, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis wrote:
 For @safe to function properly, dereferencing null _must_ be 
 guaranteed to be memory safe, and for dmd it is, since it will always 
 segfault. Unfortunately, as understand it, it is currently possible 
 with ldc's optimizer to run into trouble, since it'll do things like 
 see that something must be null and therefore assume that it must 
 never be dereferenced, since it would clearly be wrong to dereference 
 it. And then when the code hits a point where it _does_ try to 
 dereference it, you get undefined behavior. It's something that needs 
 to be fixed in ldc, but based on discussions I had with Johan at dconf 
 this year about the issue, I suspect that the spec is going to have to 
 be updated to be very clear on how dereferencing null has to be 
 handled before the ldc guys do anything about it. As long as the 
 optimizer doesn't get involved everything is fine, but as great as 
 optimizers can be at making code faster, they aren't really written 
 with stuff like @safe in mind.
One big problem is the way people talk and write about this issue. There is a difference between "dereferencing" in the language, and reading from a memory address by the CPU.
In general, I always consider "dereferencing" the point at which 
code follows a pointer to read or write its data. The semantics 
of modifying the type to mean the data vs. the pointer to it 
seems less interesting. Types are compiler-internal things; the 
actual reads and writes are what cause the problems.

But really, it's the act of using a pointer to read/write the 
data it points at which causes the segfault. And in D, we assume 
that this action is safe because of the MMU protecting the first 
page.
 Confusing language semantics with what the CPU is doing happens often in 
 the D community and is not helping these debates.
 
 D is proclaiming that dereferencing `null` must segfault but that is not 
 implemented by any of the compilers. It would require inserting null 
 checks upon every dereference. (This may not be as slow as you may 
 think, but it would probably not make code run faster.)
 
 An example:
 ```
 class A {
      int i;
      final void foo() {
           import std.stdio; writeln(__LINE__);
          // i = 5;
      }
 }
 
 void main() {
      A a;
      a.foo();
 }
 ```
 
 In this case, the actual null dereference happens on the last line of 
 main. The program runs fine however since dlang 2.077.
Right, the point is that the segfault happens when null pointers 
are used to get at the data. If you turn something that is 
ultimately a pointer into another type of pointer, then you 
aren't really dereferencing it. This happens when you pass 
*pointer into a function that takes a reference (or when you pass 
around a class reference).

In any case, the versions prior to 2.077 didn't segfault; they 
just had a prelude in front of every function which asserted that 
this wasn't null (you actually get a nice stack trace).
 Now when `foo` is modified such that it writes to member field `i`, the 
 program does segfault (writes to address 0).
 D does not make dereferencing on class objects explicit, which makes it 
 harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
 
 So, I think all compiler implementations are not spec compliant on this 
 point.
I think if the spec says that dereferencing doesn't mean 
following a pointer to its data, and reading/writing that data, 
and it says null dereferences cause a segfault, then the spec 
needs to be updated. The @safe segfault is what it should be 
focused on, not some abstract concept that exists only in the 
compiler. If it means changing the terminology, then we should do 
that.
 I think most people believe that compliance is too costly for the kind 
 of software one wants to write in D; the issue is similar to array 
 bounds checking that people explicitly disable or work around.
 For compliance we would need to change the compiler to emit null checks 
 on all @safe dereferences (the opposite direction was chosen in 2.077). 
 It'd be interesting to do the experiment.
The whole point of using the MMU instead of instrumentation is 
that we can avoid the performance penalties and still be safe. 
The only loophole is large structures that may extend beyond the 
protected region. I would suggest that the compiler inject an 
extra read of the front of any such data type (when @safe is 
enabled) to cause a segfault properly.

-Steve
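A sketch of the large-struct loophole (field names hypothetical): only the unmapped region around address 0 is guaranteed to fault, so a field at a large offset from a null base may not.

```
// Sketch of the loophole: a field far past the start of the struct may
// lie outside the OS's unmapped guard region around address 0.
struct Huge
{
    ubyte[1 << 20] pad; // 1 MiB of padding pushes `tail` far from the base
    int tail;
}

void main()
{
    Huge* p = null;
    // `p.tail` would read address 1048576; whether that faults depends on
    // what the OS happens to have mapped there. An injected read of the
    // struct's first byte, as suggested above, would hit the
    // guaranteed-unmapped first page and fault reliably.
}
```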
Nov 20 2018
next sibling parent reply Johan Engelen <j j.nl> writes:
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, which 
 makes it harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing 
a function of the object and calling it. Whether it is a virtual 
function or a final function, that shouldn't matter. There are 
different ways of implementing class function calls, but here 
people often seem to pin things down to one specific way. I feel 
I stand alone in the D community in treating the language in this 
abstract sense (like C and C++ do; other languages I don't know). 
It's similar to how people think that local variables and the 
function return address are put on a stack, even though that is 
just an implementation detail that is free to be changed (and 
does often change: local variables are regularly _not_ stored on 
the stack [*]).

Optimization isn't allowed to change the behavior of a program, 
yet already simple dead-code elimination would when null 
dereference is not treated as UB or when it is not guarded by a 
null check. Here is an example of code that also does what you 
call a "dereference" (it reads an object data member):

```
class A {
    int i;
    final void foo() {
        int a = i; // no crash with -O
    }
}

void main() {
    A a;
    a.foo(); // dereference happens
}
```

When you don't call `a.foo()` a dereference, you basically say 
that `this` is allowed to be `null` inside a class member 
function. (And then it'd have to be normal to do `if (this) ...` 
inside class member functions...)

These discussions are hard to do on a mailing list, so I'll stop 
here. Until next time at DConf, I suppose... ;-)

-Johan

[*] I intentionally didn't say where those local variables _are_ 
stored, so that people can solve that little puzzle for 
themselves ;-)
Nov 20 2018
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 23:14:27 +0000, Johan Engelen wrote:
 When you don't call `a.foo()` a dereference, you basically say that
 `this` is allowed to be `null` inside a class member function. (and then
 it'd have to be normal to do `if (this) ...` inside class member
 functions...)
That's what we have today:

module scratch;
import std.stdio;

class A {
    int i;
    final void f() {
        writeln(this is null);
        writeln(i);
    }
}

void main() {
    A a;
    a.f();
}

This prints `true` and then gets a segfault. Virtual function calls 
have to do a dereference to figure out which potentially overridden 
function to call.
Nov 20 2018
parent Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 03:05:07 UTC, Neia Neutuladh 
wrote:
 Virtual function calls have to do a dereference to figure out 
 which potentially overrided function to call.
"have to do a dereference" in terms of "dereference" as language semantic: yes. "have to do a dereference" in terms of "dereference" as reading from memory: no. If you have proof of the runtime type of an object, then you can use that information to have the CPU call the overrided function directly without reading from memory. -Johan
Nov 21 2018
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
 Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, 
 which makes it harder to see where the dereference is 
 happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter.
It matters a lot. A virtual function is reached through a pointer 
stored in the instance (the vtable), so there is a dereference of the 
`this` pointer to get the address of the function. For a final 
function, the address of the function is known at compile time and no 
dereferencing is necessary.

That is a thing that a lot of people do not get: a member function and 
a plain function are basically the same thing. What distinguishes them 
is their mangled name. You can call a non-virtual member function from 
an assembly source if you know the symbol name. UFCS uses this fact, 
that a member function and a plain function are indistinguishable from 
an object-code point of view, to fake member functions.
 There are different ways of implementing class function calls, 
 but here often people seem to pin things down to one specific 
 way. I feel I stand alone in the D community in treating the 
 language in this abstract sense (like C and C++ do, other 
 languages I don't know). It's similar to that people think that 
 local variables and the function return address are put on a 
 stack; even though that is just an implementation detail that 
 is free to be changed (and does often change: local variables 
 are regularly _not_ stored on the stack [*]).

 Optimization isn't allowed to change behavior of a program, yet 
 already simple dead-code-elimination would when null 
 dereference is not treated as UB or when it is not guarded by a 
 null check. Here is an example of code that also does what you 
 call a "dereference" (read object data member):
 ```
 class A {
     int i;
     final void foo() {
         int a = i; // no crash with -O
     }
 }

 void main() {
     A a;
     a.foo();  // dereference happens
 }
No. There's no dereferencing. foo does nothing visible and can be replaced by a NOP. For the call, no dereferencing required.
 ```

 When you don't call `a.foo()` a dereference, you basically say
Again, no dereferencing for a (final) function call. `a.foo()` is the same thing as `foo(a)` by reverse UFCS. The generated code is identical. It is only the compiler that will use different mangled names.
 that `this` is allowed to be `null` inside a class member 
 function. (and then it'd have to be normal to do `if (this) 
 ...` inside class member functions...)

 These discussions are hard to do on a mailinglist, so I'll stop 
 here. Until next time at DConf, I suppose... ;-)

 -Johan

 [*] intentionally didn't say where those local variables _are_ 
 stored, so that people can solve that little puzzle for 
 themselves ;-)
Nov 21 2018
parent Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 09:31:41 UTC, Patrick Schluter 
wrote:
 On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen 
 wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
 Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, 
 which makes it harder to see where the dereference is 
 happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter.
 It matters a lot. A virtual function is reached through a pointer 
 stored in the instance, so there is a dereference of the this 
 pointer to get the address of the function. For a final function, 
 the address of the function is known at compile time and no 
 dereferencing is necessary.

 That is a thing that a lot of people do not get: a member function 
 and a plain function are basically the same thing. What 
 distinguishes them is their mangled name. You can call a non-virtual 
 member function from an assembly source if you know the symbol name. 
 UFCS uses this fact, that a member function and a plain function are 
 indistinguishable from an object-code point of view, to fake member 
 functions.
This and the rest of your email is exactly the kind of thinking that I 
oppose, where language semantics and compiler implementation are being 
mixed. I don't think it's possible to write an optimizing compiler 
where that way of reasoning works. So D doesn't do that, and we have 
to treat language semantics separately from implementation details.

(Virtual functions don't have to be implemented using vtables, local 
variables don't have to be on a stack, "a+b" does not need to result 
in a CPU add instruction, "foo()" does not need to result in a CPU 
procedure call instruction, etc, etc, etc. D is not a portable 
assembly language.)

-Johan
Nov 21 2018
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/20/18 6:14 PM, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, which makes 
 it harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
 But `a.foo()` is already using the object's data: it is accessing a 
 function of the object and calling it. Whether it is a virtual 
 function, or a final function, that shouldn't matter. There are 
 different ways of implementing class function calls, but here often 
 people seem to pin things down to one specific way. I feel I stand 
 alone in the D community in treating the language in this abstract 
 sense (like C and C++ do, other languages I don't know). It's 
 similar to that people think that local variables and the function 
 return address are put on a stack; even though that is just an 
 implementation detail that is free to be changed (and does often 
 change: local variables are regularly _not_ stored on the stack [*]).

 Optimization isn't allowed to change behavior of a program, yet 
 already simple dead-code-elimination would when null dereference is 
 not treated as UB or when it is not guarded by a null check. Here is 
 an example of code that also does what you call a "dereference" 
 (read object data member):
 ```
 class A {
     int i;
     final void foo() {
         int a = i; // no crash with -O
     }
 }

 void main() {
     A a;
     a.foo();  // dereference happens
 }
 ```
I get what you are saying. But in terms of memory safety *both 
results* are safe. The one where the code is eliminated is safe, and 
the one where the segfault happens is safe.

This is a tricky area, because D depends on a hardware feature for 
language correctness. In other words, it's perfectly possible for a 
null read or write to not result in a segfault, which would make D's 
allowance of dereferencing a null object without checking for null 
actually unsafe (now it's just another dangling pointer).

In terms of language semantics, I don't know what the right answer is. 
If we want to say that if an optimizer changes program behavior, the 
code must be UB, then this would have to be UB.

But I would prefer saying something like -- if a segfault occurs and 
the program continues, the system is in UB-land, but otherwise, it's 
fine. If this means an optimized program runs and a non-optimized one 
crashes, then that's what it means. I'd be OK with that result. It's 
like Schrodinger's segfault!

I don't know what it means in terms of compiler assumptions, so that's 
where my ignorance will likely get me in trouble :)
 These discussions are hard to do on a mailinglist, so I'll stop here. 
 Until next time at DConf, I suppose... ;-)
Maybe that is a good time to discuss for learning how things work. But clearly people would like to at least have a say here. I still feel like using the hardware to deal with null access is OK, and a hard-crash is the best result for something that clearly would be UB otherwise. -Steve
Nov 22 2018
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 22.11.18 16:19, Steven Schveighoffer wrote:
 
 In terms of language semantics, I don't know what the right answer is. 
 If we want to say that if an optimizer changes program behavior, the 
 code must be UB, then this would have to be UB.
 
 But I would prefer saying something like -- if a segfault occurs and the 
 program continues, the system is in UB-land, but otherwise, it's fine. 
 If this means an optimized program runs and a non-optimized one crashes, 
 then that's what it means. I'd be OK with that result. It's like 
 Schrodinger's segfault!
 
 I don't know what it means in terms of compiler assumptions, so that's 
 where my ignorance will likely get me in trouble :)
This is called nondeterministic semantics, and it is a good idea if you want both efficiency and memory safety guarantees, but I don't know how well our backends would support it. (However, I think it is necessary anyway, e.g. to give semantics to pure functions.)
Dec 03 2018
prev sibling parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
Schveighoffer wrote:
 But really, it's the act of using a pointer to read/write the 
 data it points at which causes the segfault. And in D, we 
 assume that this action is  safe because of the MMU protecting 
 the first page.
This is like me saying I won't bother locking up when I leave the house, cause if the alarm goes off the security company will come around and take care of things anyway. But by then, it's too late.
 D is proclaiming that dereferencing `null` must segfault but 
 that is not implemented by any of the compilers. It would 
 require inserting null checks upon every dereference. (This 
 may not be as slow as you may think, but it would probably not 
 make code run faster.)
Aristotle would have immediately solved this dilemma. Null is a valid value for reference types. Dereferencing null can lead to bad things happening. Therefore, check for null before dereferencing a reference type. Problem solved.
Nov 20 2018
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, November 20, 2018 11:04:08 AM MST Johan Engelen via Digitalmars-
d-learn wrote:
 On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis

 wrote:
 For  safe to function properly, dereferencing null _must_ be
 guaranteed to be memory safe, and for dmd it is, since it will
 always segfault. Unfortunately, as understand it, it is
 currently possible with ldc's optimizer to run into trouble,
 since it'll do things like see that something must be null and
 therefore assume that it must never be dereferenced, since it
 would clearly be wrong to dereference it. And then when the
 code hits a point where it _does_ try to dereference it, you
 get undefined behavior. It's something that needs to be fixed
 in ldc, but based on discussions I had with Johan at dconf this
 year about the issue, I suspect that the spec is going to have
 to be updated to be very clear on how dereferencing null has to
 be handled before the ldc guys do anything about it. As long as
 the optimizer doesn't get involved everything is fine, but as
 great as optimizers can be at making code faster, they aren't
 really written with stuff like  safe in mind.
One big problem is the way people talk and write about this issue. 
There is a difference between "dereferencing" in the language, and 
reading from a memory address by the CPU. Confusing language semantics 
with what the CPU is doing happens often in the D community and is not 
helping these debates.

D is proclaiming that dereferencing `null` must segfault, but that is 
not implemented by any of the compilers. It would require inserting 
null checks upon every dereference. (This may not be as slow as you 
may think, but it would probably not make code run faster.)

An example:

```
class A {
    int i;
    final void foo() {
        import std.stdio;
        writeln(__LINE__);
        // i = 5;
    }
}

void main() {
    A a;
    a.foo();
}
```

In this case, the actual null dereference happens on the last line of 
main. The program runs fine however since dlang 2.077. Now when `foo` 
is modified such that it writes to member field `i`, the program does 
segfault (writes to address 0).

D does not make dereferencing on class objects explicit, which makes 
it harder to see where the dereference is happening.
Yeah. It's one of those areas where the spec will need to be clear. 
Like C++, D doesn't actually dereference unless it needs to. And IMHO, 
that's fine. The core issue is that operations that aren't memory safe 
can't be allowed to happen in @safe code, and the spec needs to be 
defined in such a way that requires that to be true, though not 
necessarily by being super specific about every detail of how a 
compiler is required to do it.
 So, I think all compiler implementations are not spec compliant
 on this point.
 I think most people believe that compliance is too costly for the
 kind of software one wants to write in D; the issue is similar to
 array bounds checking that people explicitly disable or work
 around.
 For compliance we would need to change the compiler to emit null
 checks on all  safe dereferences (the opposite direction was
 chosen in 2.077). It'd be interesting to do the experiment.
Ultimately here, the key thing is that it must be guaranteed that 
dereferencing null is safe in @safe code (regardless of whether that 
involves * or . and regardless of how that is achieved). It must never 
read from or write to invalid memory. If it can, then dereferencing a 
null pointer or class reference is not memory safe, and since there's 
no way to know whether a pointer or class reference is null or not via 
the type system, dereferencing pointers and references in general 
would then be @system, and that simply can't be the case, or @safe is 
completely broken.

Typically, that protection is done right now via segfaults, but we 
know that that's not always possible. For instance, if the object is 
large enough (larger than one page size IIRC), then attempting to 
dereference a null pointer won't necessarily segfault. It can actually 
end up accessing invalid memory if you try to access a member variable 
that's deep enough in the object. I know that in that particular case, 
Walter's answer to the problem is that such objects should be illegal 
in @safe code, but AFAIK, neither the compiler nor the spec have yet 
been updated to match that decision, which needs to be fixed.

But regardless, in any and all cases where we determine that a 
segfault won't necessarily protect against accessing invalid memory 
when a null pointer or reference is dereferenced, then we need to do 
_something_ to guarantee that that code is @safe - which probably 
means adding additional null checks in most cases, though in the case 
of the overly large object, Walter has a different solution.

IMHO, requiring something in the spec like "it must segfault when 
dereferencing null", as has been suggested before, is probably not a 
good idea and is really getting too specific (especially considering 
that some folks have argued that not all architectures segfault like 
x86 does), but ultimately, the question needs to be discussed with 
Walter. I did briefly discuss it with him at this last dconf, but I 
don't recall exactly what he had to say about the ldc optimization 
stuff. I _think_ that he was hoping that there was a way to tell the 
optimizer to just not do that kind of optimization, but I don't 
remember for sure. Ultimately, the two of you will probably have to 
discuss it.

Either way, I know that he wanted a bugzilla issue on the topic, but I 
keep forgetting about it. First, I need to at least dig through the 
spec to figure out what it actually says right now, which probably 
isn't much.

- Jonathan M Davis
Nov 20 2018
parent reply Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 07:47:14 UTC, Jonathan M Davis 
wrote:
 IMHO, requiring something in the spec like "it must segfault 
 when dereferencing null" as has been suggested before is 
 probably not a good idea is really getting too specific 
 (especially considering that some folks have argued that not 
 all architectures segfault like x86 does), but ultimately, the 
 question needs to be discussed with Walter. I did briefly 
 discuss it with him at this last dconf, but I don't recall 
 exactly what he had to say about the ldc optimization stuff. I 
 _think_ that he was hoping that there was a way to tell the 
 optimizer to just not do that kind of optimization, but I don't 
 remember for sure.
The issue is not specific to LDC at all. DMD also does optimizations 
that assume that dereferencing [*] null is UB. The example I gave is 
dead-code elimination of a dead read of a member variable inside a 
class method, which can only be done either if the spec says that 
`a.foo()` is UB when `a` is null, or if `this.a` is UB when `this` is 
null.

[*] I notice you also use "dereference" for an execution machine [**] 
reading from a memory address, instead of the language doing a 
dereference (which may not necessarily mean a read from memory).

[**] Intentional weird name for the CPU? Yes. We also have D code 
running as webassembly...

-Johan
Nov 21 2018
next sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 22:24:06 UTC, Johan Engelen 
wrote:
 The issue is not specific to LDC at all. DMD also does 
 optimizations that assume that dereferencing [*] null is UB.
Do you have an example? I think it treats null dereference as implementation defined but otherwise safe.
Nov 22 2018
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, November 21, 2018 3:24:06 PM MST Johan Engelen via 
Digitalmars-d-learn wrote:
 On Wednesday, 21 November 2018 at 07:47:14 UTC, Jonathan M Davis

 wrote:
 IMHO, requiring something in the spec like "it must segfault
 when dereferencing null" as has been suggested before is
 probably not a good idea is really getting too specific
 (especially considering that some folks have argued that not
 all architectures segfault like x86 does), but ultimately, the
 question needs to be discussed with Walter. I did briefly
 discuss it with him at this last dconf, but I don't recall
 exactly what he had to say about the ldc optimization stuff. I
 _think_ that he was hoping that there was a way to tell the
 optimizer to just not do that kind of optimization, but I don't
 remember for sure.
The issue is not specific to LDC at all. DMD also does optimizations that assume that dereferencing [*] null is UB. The example I gave is dead-code-elimination of a dead read of a member variable inside a class method, which can only be done either if the spec says that`a.foo()` is UB when `a` is null, or if `this.a` is UB when `this` is null. [*] I notice you also use "dereference" for an execution machine [**] reading from a memory address, instead of the language doing a dereference (which may not necessarily mean a read from memory). [**] intentional weird name for the CPU? Yes. We also have D code running as webassembly...
Skipping a dereference of null shouldn't be a problem as far as memory 
safety goes. The issue is if the compiler decides that UB allows it to 
do absolutely anything, and it rearranges the code in such a way that 
invalid memory is accessed. That cannot be allowed in @safe code in 
any D compiler. The code doesn't need to actually segfault, but it 
absolutely cannot access invalid memory even when optimized.

Whether dmd's dead-code elimination algorithm is able to make @safe 
code unsafe, I don't know. I'm not familiar with dmd's internals, and 
in general, while I have a basic understanding of the stuff at the 
various levels of a compiler, once the discussion gets to stuff like 
machine instructions and how the optimizer works, my understanding 
definitely isn't deep. After we discussed this issue with regards to 
ldc at dconf, I brought it up with Walter, and he didn't seem to think 
that dmd had such a problem, but I didn't think to raise that 
particular possibility either. It wouldn't surprise me if dmd also had 
issues in its optimizer that made @safe not safe, and it wouldn't 
surprise me if it didn't. It's the sort of area where I'd expect ldc's 
more aggressive optimizations to be much more likely to run into 
trouble, and it's more likely to do things that Walter isn't familiar 
with, but that doesn't mean that Walter didn't miss anything with dmd 
either.

After all, he does seem to like the idea of allowing the optimizer to 
assume that assertions are true, and as far as I can tell based on 
discussions on that topic, he doesn't seem to have understood (or 
maybe just didn't agree) that if we did that, the optimizer can't be 
allowed to make that assumption if there's any possibility of the code 
not being memory safe if the assumption is wrong (at least not without 
violating the guarantees that @safe is supposed to provide), since if 
the assumption turns out to be wrong (which is quite possible, even if 
it's not likely in well-tested code), then @safe would then violate 
memory safety.

As I understand it, by definition, @safe code is supposed to not have 
undefined behavior in it, and certainly, if any compiler's optimizer 
takes undefined behavior as meaning that it can do whatever it wants 
at that point with no restrictions (which is what I gathered from our 
discussion at dconf), then I don't see how any D compiler's optimizer 
can be allowed to think that anything is UB in @safe code. That may be 
why Walter was updating various parts of the spec a while back to talk 
about compiler-defined as opposed to undefined, since there are 
certainly areas where the compiler can have leeway with what it does, 
but there are places (at least in @safe code) where there must be 
restrictions on what it can assume and do even when the implementation 
is given leeway, or @safe's memory safety guarantees won't actually be 
properly guaranteed.

In any case, clearly this needs to be sorted out with Walter, and the 
D spec needs to be updated in whatever manner best fixes the problem. 
Null pointers / references need to be guaranteed to be safe in @safe 
code. Whether that's going to require that the compiler insert 
additional null checks in at least some places, I don't know. I simply 
don't know enough about how things work with stuff like the 
optimizers, but it wouldn't surprise me if in at least some cases, the 
compiler is ultimately going to be forced to insert null checks. 
Certainly, at minimum, I think that it's quite clear that if a 
platform doesn't segfault like x86 does, then it would have to.

- Jonathan M Davis
Nov 22 2018
prev sibling parent reply Tony <tonytdominguez aol.com> writes:
isocpp.org just had a link to a blog post where someone makes a 
case for uninitialized variables in C++ being an advantage in 
that you can potentially get a warning regarding use of an 
uninitialized variable that points out an error in your code.

https://akrzemi1.wordpress.com/2018/11/22/treating-symptoms-instead-of-the-cause/
Nov 30 2018
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 1 December 2018 at 00:32:35 UTC, Tony wrote:
 isocpp.org just had a link to a blog post where someone makes a 
 case for uninitialized variables in C++ being an advantage in 
 that you can potentially get a warning regarding use of an 
 uninitialized variable that points out an error in your code.

 https://akrzemi1.wordpress.com/2018/11/22/treating-symptoms-instead-of-the-cause/
This is great when it works, but the problem is that it would be a 
gargantuan effort - and a compile-time sink - to make it work 
perfectly. When it's just about if-else-if chains, switches, or 
boolean logic as in the example, the analysis won't be too 
complicated. But swap those booleans out for a string, make the 
conditions test whether it's a phone number and whether it satisfies 
some predicate implemented in a foreign language, and you'll see where 
the problem is.
Dec 01 2018
parent reply Tony <tonytdominguez aol.com> writes:
On Saturday, 1 December 2018 at 11:16:49 UTC, Dukc wrote:
 This is great when it works, but the problem is that it would 
 be gargantuan effort -and compile time sink- to make it work 
 perfectly. When it's just about if-else if chains, switches or 
 boolean logic as in the example, the analysis won't be too 
 complicated. But swap those booleans out for a string, and make 
 the conditions to test whether it's a phone number, and whether 
 it satisfies some predicate implemented in a foreign language, 
 and you'll see where the problem is.
I think he is just talking about the compiler or static analyzer seeing if a variable has been given a value before it is used, not if it was given a valid value.
Dec 01 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 01, 2018 at 06:30:05PM +0000, Tony via Digitalmars-d-learn wrote:
 On Saturday, 1 December 2018 at 11:16:49 UTC, Dukc wrote:
 This is great when it works, but the problem is that it would be
 gargantuan effort -and compile time sink- to make it work perfectly.
 When it's just about if-else if chains, switches or boolean logic as
 in the example, the analysis won't be too complicated. But swap
 those booleans out for a string, and make the conditions to test
 whether it's a phone number, and whether it satisfies some predicate
 implemented in a foreign language, and you'll see where the problem
 is.
I think he is just talking about the compiler or static analyzer seeing if a variable has been given a value before it is used, not if it was given a valid value.
But that's precisely the problem. It's not always possible to tell 
whether a variable has been initialized. E.g.:

	int func(int x) {
		int *p;

		if (solveRiemannHypothesis()) {
			p = &x;
		}

		...

		if (solveArtinsConjecture()) {
			*p++;
		}
		return x;
	}

For arbitrarily complex intervening code, determining whether or not a 
certain code path would be taken (that would initialize the variable) 
is equivalent to solving the halting problem, which is undecidable.

In the above contrived example, Artin's conjecture is implied by the 
Riemann hypothesis, so the second if statement would only run if p is 
initialized. But there is no way the compiler is going to be able to 
deduce this, especially not during compile time. So it is not possible 
to correctly flag p as being initialized or not when it is 
dereferenced.

Therefore, leaving it up to the compiler to detect uninitialized 
variables is unreliable, and therefore any code that depends on this 
cannot be trusted. Code like the above could be exploited by a 
sufficiently sophisticated hack to make the uninitialized value of p 
coincide with something that will open a security hole, and the 
compiler would not be able to reliably warn the programmer of this 
problem.

Uninitialized variables are *not* a good thing, contrary to what the 
author of the article might wish to believe.


T
Dec 01 2018
next sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:
 But that's precisely the problem. It's not always possible to 
 tell whether a variable has been initialized. E.g.:

 	int func(int x) {
 		int *p;

 		if (solveRiemannHypothesis()) {
 			p = &x;
 		}

 		...

 		if (solveArtinsConjecture()) {
 			*p++;
 		}
 		return x;
 	}
If you are willing to lose some precision you can still analyse this. 
Google "abstract interpretation".

For instance, after the first if, the value of p is (&x || null). 
Since the compiler can prove which branch is taken, the analysis has 
to assume both are. Inside the second if, p gets dereferenced, but 
since p is (&x || null) - that is, it might be null - that is a 
compile-time error.

The take-away is that you don't need to know which code path will be 
taken; you just combine both states.
Dec 01 2018
parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 1 December 2018 at 20:41:53 UTC, Sebastiaan Koppe 
wrote:
 Since the compiler can prove which branch is taken, the analyse 
 has to assume both are.
*can't*
Dec 01 2018
prev sibling next sibling parent aliak <something something.com> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:
 In the above contrived example, Artin's conjecture is implied 
 by the Riemann hypothesis, so the second if statement would 
 only run if p is initialized. But there is no way the compiler 
 is going to be able to deduce this, especially not during 
 compile time. So it is not possible to correctly flag p as 
 being initialized or not when it is dereferenced.

 Therefore, leaving it up to the compiler to detect 
 uninitialized variables is unreliable, and therefore any code 
 that depends on this cannot be trusted. Code like the above 
 could be exploited by a sufficiently sophisticated hack to make 
 the uninitialized value of p coincide with something that will 
 open a security hole, and the compiler would not be able to 
 reliably warn the programmer of this problem.

 Uninitialized variables are *not* a good thing, contrary to 
 what the author of the article might wish to believe.


 T
If a compiler were to issue warnings/errors for uninitialized 
variables, then that example would be a compiler error. The logic 
would just be that not all code paths lead to an initialized variable, 
therefore *p++ is not guaranteed to be initialized - i.e. an error.

Swift takes this approach.

Cheers,
- Ali
Dec 02 2018
prev sibling parent Tony <tonytdominguez aol.com> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:

 But that's precisely the problem. It's not always possible to 
 tell whether a variable has been initialized. E.g.:
To me, the possibility of a "false positive" doesn't preclude the use of a warning unless that possibility is large. Besides using a compiler option or pragma to get rid of it, the warning also goes away if you assign NULL or (X *) 0. Surprisingly, clang (gcc 6.3 does not give the warning) is not smart enough to then issue a "possibly dereferencing null pointer" warning.
 Therefore, leaving it up to the compiler to detect 
 uninitialized variables is unreliable, and therefore any code 
 that depends on this cannot be trusted. Code like the above 
 could be exploited by a sufficiently sophisticated hack to make 
 the uninitialized value of p coincide with something that will 
 open a security hole, and the compiler would not be able to 
 reliably warn the programmer of this problem.
I don't know that "leaving it up to the compiler" is a correct characterization. I don't see the programmer doing anything different with the warning capability in the compiler than if it wasn't there. In either case, the programmer will attempt to supply values to all the variables they have declared and are intending to use, and in the correct order.
Dec 02 2018
prev sibling next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Mon, 19 Nov 2018 21:23:31 +0000, Jordi Gutiérrez Hermoso wrote:
 When I was first playing with D, I managed to create a segfault by doing
 `SomeClass c;` and then trying do something with the object I thought I
 had default-created, by analogy with C++ syntax. Seasoned D programmers
 will recognise that I did nothing of the sort and instead created c is
 null and my program ended up dereferencing a null pointer.
Programmers coming from nearly any language other than C++ would find it expected and intuitive that declaring a class instance variable leaves it null. The compiler *could* give you a warning that you're using an uninitialized variable in a way that will lead to a segfault, but that sort of flow analysis gets hard fast.

If you wanted the default constructor to be called implicitly, that would make @nogc functions behave significantly differently (they'd forbid declarations without explicit initialization or would go back to default null), and it would be a problem for anything that doesn't have a no-args constructor (again, this would either be illegal or go back to null). Easier for everything to be consistent and everything to be initialized to null.
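A minimal sketch of the behaviour under discussion (the class name `SomeClass` is taken from the original post): declaring a class reference without `new` leaves it at its default value, null.

```d
import std.stdio;

class SomeClass
{
    int x;
}

void main()
{
    SomeClass c;            // no object is created; c is just a null reference
    assert(c is null);      // this is the state the original poster hit
    auto d = new SomeClass; // explicit allocation is required to get an object
    assert(d !is null);
    writeln("ok");
}
```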
Nov 19 2018
parent reply Jordi Gutiérrez Hermoso <jordigh octave.org> writes:
On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh wrote:

 Programmers coming from nearly any language other than C++ 
 would find it expected and intuitive that declaring a class 
 instance variable leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way? Either

   SomeClass c = null;

or

   SomeClass c = new SomeClass();

and nothing else.
 The compiler *could* give you a warning that you're using an 
 uninitialized variable in a way that will lead to a segfault, 
 but that sort of flow analysis gets hard fast.
Nulls/Nones are always a big gap in a language's type system. A common alternative is to have some Option/Maybe type like Rust or Haskell or D's Variant. How about making that required to plug the null gap?
 If you wanted the default constructor to be called implicitly,
Yeah, maybe this bit of C++ syntax isn't the best idea. What about other alternatives?
Nov 19 2018
next sibling parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 00:30:44 UTC, Jordi Gutiérrez 
Hermoso wrote:
 Yeah, maybe this bit of C++ syntax isn't the best idea. What 
 about other alternatives?
You could try testing for null before dereferencing ;-) If the following code in D did what you'd reasonably expect it to do, then you could do this:

-----
module test;

import std.stdio;

class C
{
    public int x;
}

void main()
{
    try
    {
        C c = null;
        c.x = 100;
    }
    catch(Exception e)
    {
        writeln(e.msg);
    }
}
-----
Nov 19 2018
prev sibling next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 00:30:44 +0000, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh wrote:
 
 Programmers coming from nearly any language other than C++ would find
 it expected and intuitive that declaring a class instance variable
 leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way?
The prevailing idea is that warnings are either non-problems, in which case they shouldn't be emitted, or things you really need to fix, in which case they should be errors. Things that are sometimes errors can be left to lint tools.
 Either
 
    SomeClass c = null;
 
 or
 
    SomeClass c = new SomeClass();
 
 and nothing else.
That would work, though it would be mildly tedious. However, the general philosophy with D is that things should be implicitly initialized to a default state equal to the `.init` property of the type. That default state can be user-defined with structs, but with other types, it is generally an 'empty' state that has well-defined semantics. For floating point values, that is NaN. For integers, it's 0. For arrays, it's a null array with length 0. For objects and pointers, it's null.
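The `.init` defaults described above can be checked directly; this small sketch asserts each of the default states listed:

```d
import std.math : isNaN;

class C {}

void main()
{
    double f; // floating point defaults to NaN
    int i;    // integers default to 0
    int[] a;  // arrays default to a null array with length 0
    C c;      // objects (class references) default to null
    assert(f.isNaN);
    assert(i == 0);
    assert(a is null && a.length == 0);
    assert(c is null);
}
```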
 Nulls/Nones are always a big gap in a language's type system. A common
 alternative is to have some Option/Maybe type like Rust or Haskell or
 D's Variant.
Variant is about storing arbitrary values in the same variable. Nullable is the D2 equivalent of Option or Maybe.
 How about making that required to plug the null gap?
That's extremely unlikely to make it into D2 and rather unlikely to make it into a putative D3. However, if you feel strongly enough about it, you can write a DIP. I've used Kotlin with its null safety, and I honestly haven't seen benefits from it. I have seen some NullPointerExceptions in slightly different places and some NullPointerExceptions instead of empty strings in log messages, but that's it.
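As a sketch of the Nullable-as-Option style mentioned above (the `find` helper is illustrative, not from the thread), `std.typecons.Nullable` makes the "no value" case explicit instead of encoding it as a raw null:

```d
import std.typecons : Nullable, nullable;

// Returns the first match, or an explicit "no value" instead of null.
Nullable!int find(int[] xs, int needle)
{
    foreach (x; xs)
        if (x == needle)
            return nullable(x);
    return Nullable!int.init; // empty Nullable, checked via isNull
}

void main()
{
    auto hit = find([1, 2, 3], 2);
    assert(!hit.isNull && hit.get == 2);

    auto miss = find([1, 2, 3], 9);
    assert(miss.isNull); // caller is forced to consider the missing case
}
```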
Nov 19 2018
parent aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 03:24:56 UTC, Neia Neutuladh 
wrote:
 On Tue, 20 Nov 2018 00:30:44 +0000, Jordi Gutiérrez Hermoso 
 wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh 
 wrote:
 
 Programmers coming from nearly any language other than C++ 
 would find it expected and intuitive that declaring a class 
 instance variable leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way?
The prevailing idea is that warnings are either non-problems, in which case they shouldn't be emitted, or things you really need to fix, in which case they should be errors. Things that are sometimes errors can be left to lint tools.
 Either
 
    SomeClass c = null;
 
 or
 
    SomeClass c = new SomeClass();
 
 and nothing else.
That would work, though it would be mildly tedious. However, the general philosophy with D is that things should be implicitly initialized to a default state equal to the `.init` property of the type. That default state can be user-defined with structs, but with other types, it is generally an 'empty' state that has well-defined semantics. For floating point values, that is NaN. For integers, it's 0. For arrays, it's a null array with length 0. For objects and pointers, it's null.
 Nulls/Nones are always a big gap in a language's type system. 
 A common alternative is to have some Option/Maybe type like 
 Rust or Haskell or D's Variant.
Variant is about storing arbitrary values in the same variable. Nullable is the D2 equivalent of Option or Maybe.
 How about making that required to plug the null gap?
That's extremely unlikely to make it into D2 and rather unlikely to make it into a putative D3. However, if you feel strongly enough about it, you can write a DIP. I've used Kotlin with its null safety, and I honestly haven't seen benefits from it. I have seen some NullPointerExceptions in slightly different places and some NullPointerExceptions instead of empty strings in log messages, but that's it.
Think this would highly depend on your usecase. Having crashing mobile apps mostly leads to bad reviews because it's a UX nightmare, for example. And with webservices it's a pain a lot of the time when it just crashes as well (analytics workers, for example). Kotlin's null safety stops you from this quite well as long as you don't interface with Java libraries - then it's near useless because your compiler guarantees go out the window. But Swift... so far ... 👌 It's also a code review blessing. You just know for sure that this code won't crash and the object is "valid" because they've properly unwrapped a nullable. I can't even count the number of times (and I'd wager there're millions of similar commits) where I've put up a commit (during my C++ days) that says "fix crash" and the code is just "if(!ptr) { return; }" or a variant of that. Ok, sorry, I rambled a bit :p Cheers, - Ali
Nov 19 2018
prev sibling next sibling parent aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 00:30:44 UTC, Jordi Gutiérrez 
Hermoso wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh 
 wrote:

 [...]
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way? Either SomeClass c = null; or SomeClass c = new SomeClass(); and nothing else.
 [...]
Nulls/Nones are always a big gap in a language's type system. A common alternative is to have some Option/Maybe type like Rust or Haskell or D's Variant. How about making that required to plug the null gap?
You can give optional (https://code.dlang.org/packages/optional) a try and see if that works for you.
Nov 19 2018
prev sibling parent Dukc <ajieskola gmail.com> writes:
 Nulls/Nones are always a big gap in a language's type system. A 
 common alternative is to have some Option/Maybe type like Rust 
 or Haskell or D's Variant. How about making that required to 
 plug the null gap?
There are others too who feel like that too: https://news.ycombinator.com/item?id=18588239
Dec 04 2018
prev sibling next sibling parent reply welkam <wwwelkam gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 Why does nobody seem to think that `null` is a serious problem 
 in D?
Because the more you learn about D the less you want to use classes. I view class as a compatibility feature for when you want to port Java code to D. For regular code just use structs. For inheritance you could use alias this, for polymorphism - templates, etc. If you want to write OOP code you don't have to always start with the keyword class. And since you don't use classes, you don't view null as a high priority problem.
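A minimal sketch of the struct-based style described above (the `Animal`/`Dog` names are illustrative): `alias this` gives subtyping-like reuse, and a template parameter gives polymorphism without a vtable:

```d
struct Animal
{
    string name;
    string speak() { return "..."; }
}

struct Dog
{
    Animal base;
    alias base this;          // Dog implicitly converts to Animal
    string speak() { return "woof"; }
}

// Compile-time polymorphism: resolved per type, no virtual dispatch.
string greet(T)(T a)
{
    return a.name ~ " says " ~ a.speak();
}

void main()
{
    auto d = Dog(Animal("Rex"));
    assert(greet(d) == "Rex says woof"); // Dog.speak wins, name via alias this
    Animal a = d;                        // alias this allows the conversion
    assert(a.name == "Rex");
}
```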
Nov 20 2018
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
Nov 20 2018
parent reply NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 15:46:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
a fan of classes...on the D forum? I don't get it. but of course you are right. classes do rock! In fact, there is not a better programming construct that I am aware of that provides a better 'explicit' mapping from external objects to program constructs. Thank you Kristen and Ole-Johan.
Nov 21 2018
parent reply welkam <wwwelkam gmail.com> writes:
On Wednesday, 21 November 2018 at 09:20:01 UTC, NoMoreBugs wrote:
 On Tuesday, 20 November 2018 at 15:46:35 UTC, Adam D. Ruppe 
 wrote:
 On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
a fan of classes...on the D forum? I don't get it. but of course you are right. classes do rock! In fact, there is not a better programming construct that I am aware of, the provides a better 'explicit' mapping from external objects to program constructs. Thank you Kristen and Ole-Johan.
One thing that bugs me in programming is that in different programming languages the same things are named differently and things that are named the same are different. For example, D's classes and C++'s classes might be considered the same, but classes in C++ and Java are not the same thing. In D, classes are reference types and unless you mark them as final they will have a vtable. Let's face it, most people don't mark their classes as final. What all this means is that EVERY access to a class member value goes through an indirection (additional cost) and EVERY method call goes through 2 indirections (one to get the vtable and a second to call the function (method) from the vtable). Now Java also has indirect vtable calls, but it also has optimization passes that convert methods to final if they are not overridden. If Java didn't do that it would run as slow as Ruby. AFAIK D doesn't have such an optimization pass. On top of that some people want to check on EVERY dereference whether the pointer is not null. How slow do you want your programs to run?

That's the negatives, but what benefits do classes give us? First, being reference types, it's easy to move them in memory. That would be nice for a compacting GC, but D doesn't have a compacting GC. Second, they are useful for when you need to run code that someone else wrote for your project. Something like a plugin system. [sarcasm]This is happening everyday[/sarcasm] Third, porting code from Java to D.

Everything else you can do with structs and other D features.
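The reference-type point above is the crux of the class/struct split; a minimal sketch of the difference (illustrative `CPoint`/`SPoint` names):

```d
class CPoint  { int x; } // reference type
struct SPoint { int x; } // value type

void main()
{
    auto c1 = new CPoint;
    auto c2 = c1;        // copies the reference, not the object
    c2.x = 5;
    assert(c1.x == 5);   // both names alias one heap object

    SPoint s1;
    auto s2 = s1;        // copies the whole value
    s2.x = 5;
    assert(s1.x == 0);   // s1 is untouched
}
```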
Nov 21 2018
parent Neia Neutuladh <neia ikeran.org> writes:
On Wed, 21 Nov 2018 20:15:42 +0000, welkam wrote:
 In D classes are reference type and unless you mark them as final they
 will have vtable.
Even if you mark your class as final, it has a vtable because it inherits from Object, which has virtual functions. The ProtoObject proposal is for a base class that has no member functions. If you had a final class that inherited from ProtoObject instead of Object, it would have an empty vtable.
 Lets face it most people dont mark their classes as
 final. What all this mean is that EVERY access to class member value
 goes trough indirection (additional cost)
D classes support inheritance. They implicitly cast to their base types. They can add fields not present in their base types. If they were value types, this would mean you'd lose those fields when up-casting, and then you'd get memory corruption from calling virtual functions. That is a cost that doesn't happen with structs, I'll grant, but the only way to avoid that cost is to give up inheritance. And inheritance is a large part of the reason to use classes instead of structs.
 and EVERY method call goes
 trough 2 indirections (one to get vtable and second to call
 function(method) from vtable).
Virtual functions do, that is. That's the vast majority of class member function calls.
 Now Java also have indirect vtable calls
 but it also have optimization passes that convert methods to final if
 they are not overridden. If Java didnt do that it would run as slow as
 Ruby.
Yeah, no. https://git.ikeran.org/dhasenan/snippets/src/branch/master/virtualcalls/ results Java and DMD both managed to de-virtualize and inline the function. DMD can do this in simple cases; Java can do this in a much wider range of cases but can make mistakes (and therefore has to insert guard code that will go back to the original bytecode when its hunches were wrong). If it were merely devirtualization that were responsible for Java being faster than Ruby, Ruby might be ten times the duration of Java (just as dmd without optimizations is within times the duration of dmd without optimizations). You could also argue that `int += int` in Ruby is another virtual call, so it should be within twenty times the speed of Java. Instead, it's 160 times slower than Java.
 On top of that some
 people want to check on EVERY dereference if pointer is not null. How
 slow you want your programs to run?
Every program on a modern CPU architecture and modern OS checks every pointer dereference to ensure the pointer isn't null. That's what a segfault is. Once you have virtual address space as a concept, this is free.
 Thats negatives but what benefit classes give us?
 First being reference type its easy to move them in memory. That would
 be nice for compacting GC but D doesnt have compacting GC.
You can do that with pointers, too. D doesn't do that because (a) it's difficult and we don't have the people required to make it work well enough, (b) it would make it harder to interface with other languages, (c) unions mean we would be unable to move some objects and people tend to be less thrilled about partial solutions than complete ones.
 Second they
 are useful for when you need to run code that some one else wrote for
 your project. Something like plugin system. [sarcasm]This is happening
 everyday[/sarcasm]
 Third porting code from Java to D.
 
 Everything else you can do with struct and other D features.
Similarly, you can write Java-style object oriented code in C. It's hideously ugly and rather error-prone. Every project trying to do it would do it in a different and incompatible way. Walter decided a long time ago that language support for Java-style OOP was a useful component for D to have, and having a standardized way of doing it with proper language support was better than leaving it to a library.
Nov 21 2018
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax.
D is more similar to Java here and works like languages with GC. In C++, objects are garbage-created by default, so you would have a similar problem there. To diagnose crashes on linux you can run your program under gdb.
Nov 20 2018
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, November 20, 2018 8:38:40 AM MST Kagamin via Digitalmars-d-learn 
wrote:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez

 Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 by doing `SomeClass c;` and then trying do something with the
 object I thought I had default-created, by analogy with C++
 syntax.
D is more similar to Java here and works like languages with in C++ objects are garbage-created by default, so you would have a similar problem there. To diagnose crashes on linux you can run your program under gdb.
In C++, if the class is put directly on the stack, then you get a similar situation to D's structs, only instead of it being default-initialized, it's default-constructed. So, you don't normally get garbage when you just declare a variable of a class type (though you do with other types, and IIRC, if a class doesn't have a user-defined default constructor, and a member variable's type doesn't have a default constructor, then that member variable does end up being garbage). However, if you declare a pointer to a class (which is really more analogous to what you're doing when declaring a class reference in D), then it's most definitely garbage, and the behavior is usually _far_ worse than segfaulting.

So, while I can see someone getting annoyed about a segfault, because they forgot to initialize a class reference in D, the end result is far, far safer than what C++ does. And in most cases, you catch the bug pretty fast, because pretty much the only way that you don't catch it is if that piece of code is never tested. So, while D's approach is by no means perfect, I don't think that there's really any question that as far as memory safety goes, it's far superior to C++.

- Jonathan M Davis
Nov 20 2018
prev sibling next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax. Seasoned D programmers will recognise that I did 
 nothing of the sort and instead created c is null and my 
 program ended up dereferencing a null pointer.

 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).

 What's the reasoning for allowing this?
The natural way forward for D is to add static analysis in the compiler that tracks use of possibly uninitialized classes (and perhaps also pointers). This has been discussed many times on the forums. The important thing with such an extra warning is to incrementally add it without triggering any false positives. Otherwise programmers aren't gonna use it.
Nov 22 2018
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 22 November 2018 at 15:38:18 UTC, Per Nordlöw wrote:

 The natural way forward for D is to add static analysis in the 
 compiler that tracks use of possibly uninitialized classes (and 
 perhaps also pointers). This has been discussed many times on 
 the forums. The important thing with such an extra warning is 
 to incrementally add it without triggering any false positives. 
 Otherwise programmers aren't gonna use it.
I'd say the problem here is not just false positives, but false negatives!
Nov 22 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Thu, 22 Nov 2018 15:50:01 +0000, Stefan Koch wrote:
 I'd say the problem here is not just false positives, but false
 negatives!
False negatives are a small problem. The compiler fails to catch some errors some of the time, and that's not surprising. False positives are highly vexing because it means the compiler rejects valid code, and that sometimes requires ugly circumlocutions to make it work.
Nov 22 2018
prev sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 22 November 2018 at 15:50:01 UTC, Stefan Koch wrote:
 I'd say the problem here is not just false positives, but false 
 negatives!
With emphasis on _incremental_ additions to the compiler for covering more and more positives without introducing any _false_ negatives whatsoever. Without losing compilation performance. I recall Walter saying this is challenging to get right but a very interesting task. This would make D even more competitive against languages such as Rust.
Nov 22 2018
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 22 November 2018 at 23:10:06 UTC, Per Nordlöw wrote:
 With emphasis on _incremental_ additions to the compiler for 
 covering more and more positives without introducing any 
 _false_ negatives whatsoever. Without loosing compilation 
 performance.
BTW, should such compiler checking in D include pointers besides mandatory class checking?
Nov 23 2018
prev sibling next sibling parent reply SimonN <eiderdaus gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that so few D users wish for them. I understand that it's very hard to get safely right, without code-flow analysis that Walter prefers to keep at minimum throughout D.

I'm concerned about the clarity of usercode. I would like to ensure in my function signatures that only non-null class references are accepted as input, or that only non-null class references will be returned. All possibilities in current D have drawbacks:

a) Add in/out contracts for over 90 % of the class variables? This is nasty boilerplate.

b) Check all arguments for null, check all returned values for null? This is against the philosophy that null should be cost-free. Also boilerplate.

c) Declare the function as if it accepts null, but segfault on receiving null? This looks like a bug in the program.

Even if c) becomes a convention in the codebase, then when the function segfaults in the future, it's not clear to maintainers whether the function or the caller has the bug. I discussed some ideas in 2018-03: https://forum.dlang.org/post/epjwwtstyphqknavycxt forum.dlang.org -- Simon
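Option a) above can be sketched with D's expression-based contract syntax (the `Resource`/`process` names are illustrative); note that contracts are only checked in non-release builds, which is part of why this counts as boilerplate rather than a guarantee:

```d
class Resource {}

// Contracts document and check non-null at the call boundary,
// at the cost of repeating this on nearly every signature.
Resource process(Resource r)
in (r !is null)                  // rejects a null argument
out (result; result !is null)    // promises a non-null return
{
    return r;
}

void main()
{
    auto ok = process(new Resource);
    assert(ok !is null);
}
```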
Nov 29 2018
parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d

"But I don't like the verbosity!"

    alias MyClass = NotNullable!MyClassImpl;
Nov 30 2018
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Friday, 30 November 2018 at 12:00:46 UTC, Atila Neves wrote:
 On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 When I was first playing with D, I managed to create a 
 segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d "But I don't like the verbosity!" alias MyClass = NotNullable!MyClassImpl;
Huh neat, though it would be nice to allow conversion of Nullable to NotNullable via runtime conditional checking.

    NotNullable!MyClassImpl = (MyClassImpvar != Null) ? MyClassImpvar : new MyClassImpvar();
Nov 30 2018
next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Friday, 30 November 2018 at 15:32:55 UTC, 12345swordy wrote:
 On Friday, 30 November 2018 at 12:00:46 UTC, Atila Neves wrote:
 On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 [...]
 [...]
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d "But I don't like the verbosity!" alias MyClass = NotNullable!MyClassImpl;
Huh neat, though it would nice to allow conversion of Nullable to NotNullable via runtime conditional checking. NotNullable!MyClassImpl = (MyClassImpvar != Null) ? MyClassImpvar : new MyClassImpvar();
I meant new MyClassImp(), but you get the idea.
Nov 30 2018
prev sibling parent Kagamin <spam here.lot> writes:
On Friday, 30 November 2018 at 15:32:55 UTC, 12345swordy wrote:
 NotNullable!MyClassImpl = (MyClassImpvar != Null) ? 
 MyClassImpvar : new MyClassImpvar();
AFAIK it's something like NotNullable!MyClassImp m = MyClassImpvar.orElse(new MyClassImp());
Dec 01 2018
prev sibling next sibling parent reply O-N-S (ozan) <ozan.nurettin.sueel gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).
I love Null in an empty class variable and I use it very often in my code. It simplifies a lot. What would be a better way? (practical not theoretical) Regards Ozan
Nov 29 2018
parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 30 November 2018 at 06:15:29 UTC, O-N-S (ozan) wrote:
 On Monday, 19 November 2018 at 21:23:31
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).
I love Null in an empty class variable and I use it very often in my code. It simplifies a lot. What would be a better way? (practical not theoretical) Regards Ozan
A better way is to always initialise. Invalid states should be unrepresentable.
Nov 30 2018
prev sibling parent PacMan <jckj33 gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax. Seasoned D programmers will recognise that I did 
 nothing of the sort and instead created c is null and my 
 program ended up dereferencing a null pointer.

 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).

 What's the reasoning for allowing this?
This is because you're transferring what you know from C++ to D directly. You shouldn't do that; check out how the specific language works. `Foo f;` wouldn't make sense for me: it's not allocated, it's null. So right away I used `Foo f = new Foo();`
Dec 04 2018