
digitalmars.D.learn - Why does nobody seem to think that `null` is a serious problem in D?

reply Jordi Gutiérrez Hermoso <jordigh octave.org> writes:
When I was first playing with D, I managed to create a segfault 
by doing `SomeClass c;` and then trying to do something with the 
object I thought I had default-constructed, by analogy with C++ 
syntax. Seasoned D programmers will recognise that I did nothing 
of the sort: I had instead declared `c` as a null reference, and 
my program ended up dereferencing a null pointer.
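For C++ readers, a minimal sketch of the pitfall (class name hypothetical):

```
import std.stdio;

class SomeClass { int x; }

void main()
{
    SomeClass c;              // declares a null class reference; nothing is constructed
    // writeln(c.x);          // would dereference null and segfault
    auto s = new SomeClass(); // the actual D equivalent of C++'s `SomeClass c;`
    writeln(s.x);             // prints 0, int's default value
}
```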

I'm not the only one who has done this. I can't find it right 
now, but I've seen at least one person open a bug report because 
they misunderstood this as a bug in dmd.

I have been told a couple of times that this isn't something that 
needs to be patched in the language, but I don't understand. It 
seems like a very easy way to generate a segfault (and not a 
NullPointerException or whatever).

What's the reasoning for allowing this?
Nov 19 2018
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
Nov 19 2018
next sibling parent reply aberba <karabutaworld gmail.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
Does D have a linter which warns about certain style of coding like this?
Nov 20 2018
next sibling parent Kagamin <spam here.lot> writes:
On Tuesday, 20 November 2018 at 09:27:03 UTC, aberba wrote:
 Does D have a linter which warns about certain style of coding 
 like this?
AFAIK, dscanner does some linting.
Nov 20 2018
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 20 November 2018 at 09:27:03 UTC, aberba wrote:
 Does D have a linter which warns about certain style of coding 
 like this?
dscanner might check it. I don't know though.
Nov 20 2018
prev sibling next sibling parent reply aliak <something something.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it.
This only applies to little scripts and unittests maybe. Not when you're writing any kind of relatively larger application that involves being run for longer or if there's more possible permutations of your state variables.
Nov 20 2018
parent reply Kagamin <spam here.lot> writes:
On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger 
 application that involves being run for longer or if there's 
 more possible permutations of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
Nov 20 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 15:29:50 +0000, Kagamin wrote:
 On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger application that
 involves being run for longer or if there's more possible permutations
 of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
A pointer to a struct is a reference type. There are plenty of cases where you need reference semantics for a thing. If you are optimistic about the attentiveness of your future self and potential collaborators, you might add that into a doc comment; people can either manually do escape analysis to check if they can store the thing on the stack, or allocate on the heap and pass a pointer, or embed the thing as a private member of another thing that now must be passed by reference. If you're pessimistic and don't mind a potential performance decrease, you might defensively use a class to ensure the thing is a reference type.
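A sketch of the struct-vs-class reference semantics described above (names hypothetical):

```
struct Thing { int count; }

void bumpByValue(Thing t)    { t.count++; } // value copy: caller's Thing unchanged
void bumpByPointer(Thing* t) { t.count++; } // pointer to struct: reference semantics

class ThingBox { Thing payload; }           // class: always a reference type

void main()
{
    Thing t;
    bumpByValue(t);
    assert(t.count == 0); // only the copy was modified
    bumpByPointer(&t);
    assert(t.count == 1); // modified through the pointer
}
```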
Nov 20 2018
prev sibling parent reply Aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 15:29:50 UTC, Kagamin wrote:
 On Tuesday, 20 November 2018 at 11:11:43 UTC, aliak wrote:
 This only applies to little scripts and unittests maybe.

 Not when you're writing any kind of relatively larger 
 application that involves being run for longer or if there's 
 more possible permutations of your state variables.
Umm... if you write a larger application not knowing what is a reference type, you're into lots and lots of problems.
I’m not sure I understood your point? I was saying that you don’t necessarily hit your null dereference within a couple of seconds.
Nov 20 2018
parent Chris Katko <ckatko gmail.com> writes:
My favorite D'ism so far:

 Try to learn D.
 Put writeln in destructor to prove it works as expected.
 Make random changes, program never runs again.
 Takes 30+ minutes to realize that writeln("my string") is fine, 
 but writeln("my string " ~ value) is an allocation / garbage 
 collection which crashes the program without a stack.
Nov 20 2018
prev sibling parent reply NoMoreBugs <NoMoreBugs gmail.com> writes:
On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that 
c was never assigned to, and produce a compile time error:

"c is never assigned to, and will always have its default value null"

That doesn't sound that hard to me.
Nov 21 2018
next sibling parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 10:47:35 UTC, NoMoreBugs wrote:
 On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe 
 wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that c was never assigned to, and produce a compile time error: "c is never assigned to, and will always have its default value null" That doesn't sound that hard to me.
Am I misled, or isn't this impossible by design?

´´´
import std.stdio;
import std.random;

class C
{
	size_t dummy;
	final void baz()
	{
		if(this is null)
		{
			writeln(42);
		}
		else
		{
			writeln(dummy);
		}
	}
}
void main()
{
	C c;
	c.foo;
}

void foo(ref C c)
{
	if(uniform01 < 0.5)
	{
		c = new C();
		c.dummy = unpredictableSeed;
	}
	c.baz;
}
´´´
Nov 21 2018
parent reply Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 11:53:14 UTC, Alex wrote:
 Am I misled, or isn't this impossible by design?

 ´´´
 import std.stdio;
 import std.random;

 class C
 {
 	size_t dummy;
 	final void baz()
 	{
 		if(this is null)
 		{
 			writeln(42);
 		}
 		else
 		{
 			writeln(dummy);
 		}
 	}
 }
 void main()
 {
 	C c;
 	c.foo;
 }

 void foo(ref C c)
 {
 	if(uniform01 < 0.5)
 	{
 		c = new C();
 		c.dummy = unpredictableSeed;
 	}
 	c.baz;
 }
 ´´´
A value passed to ref parameter is assumed to be initialized. A 
compiler that required initialization before use would reject the 
call to function foo.
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 14:21:44 UTC, Kagamin wrote:
 A value passed to ref parameter is assumed to be initialized. 

This was not my point. I wonder whether the case, where the 
compiler can't figure out the initialization state of an object, 
is so hard to construct.

´´´
import std.experimental.all;

class C
{
	size_t dummy;
	final void baz()
	{
		if(this is null)
		{
			writeln(42);
		}
		else
		{
			writeln(dummy);
		}
	}
}
void main()
{
	C c;
	if(uniform01 < 0.5)
	{
		c = new C();
		c.dummy = unpredictableSeed;
	}
	else
	{
		c = null;
	}
	c.baz;
	writeln(c is null);
}
´´´
Nov 21 2018
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Wed, 21 Nov 2018 17:00:29 +0000, Alex wrote:

In C#, that would be a compile error (use of an unassigned local 
object), but in D, it compiles and runs and doesn't segfault.
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 17:09:54 UTC, Neia Neutuladh 
wrote:
 On Wed, 21 Nov 2018 17:00:29 +0000, Alex wrote:

object), but in D, it compiles and runs and doesn't segfault.
No, it wouldn't. And it doesn't.

´´´
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Diagnostics;

namespace ConsoleApp1
{
    sealed class C
    {
        public int dummy;
        public void baz()
        {
            if (this is null)
            {
                Debug.WriteLine(42);
            }
            else
            {
                Debug.WriteLine(dummy);
            }
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            C c;
            Random random = new Random(4);
            int randomNumber = random.Next(0, 100);
            if (randomNumber < 50)
            {
                c = new C { dummy = 73 };
            }
            else
            {
                c = null;
            }
            c.baz();
        }
    }
}
´´´

compiled against 4.6.1 Framework.

However, of course, there is a NullReferenceException, if c 
happens to be null when calling baz.

So the difference is not the compiler behavior, but just the 
runtime behavior...

How could the compiler know the state of Random anyway, before 
the program runs?
Nov 21 2018
parent reply aliak <something something.com> writes:
On Wednesday, 21 November 2018 at 17:46:29 UTC, Alex wrote:
 compiled against 4.6.1 Framework.

 However, of course, there is a NullReferenceException, if c 
 happens to be null, when calling baz.

 So the difference is not the compiler behavior, but just the 
 runtime behavior...

 How could the compiler know the state of Random anyway, before 
 the program run.
The compiler would not be able to prove that something was nil or 
not, but swift certainly does:

```
class C {
    func baz() {}
}

func f() {
    var x: C
    if Int.random(in: 0 ..< 10) < 5 {
        x = C()
    }
    x.baz()
}
```

error: variable 'x' used before being initialized
Nov 21 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Wednesday, 21 November 2018 at 21:05:37 UTC, aliak wrote:
 On Wednesday, 21 November 2018 at 17:46:29 UTC, Alex wrote:
 compiled against 4.6.1 Framework.

 However, of course, there is a NullReferenceException, if c 
 happens to be null, when calling baz.

 So the difference is not the compiler behavior, but just the 
 runtime behavior...

 How could the compiler know the state of Random anyway, before 
 the program run.
 The compiler would not be able to prove that something was nil 
 or not, but swift certainly does:

 ```
 class C {
     func baz() {}
 }

 func f() {
     var x: C
     if Int.random(in: 0 ..< 10) < 5 {
         x = C()
     }
     x.baz()
 }
 ```

 error: variable 'x' used before being initialized
Nice! Didn't know that... But the language is a foreign one for 
me.

Nevertheless, from what I saw: shouldn't it be

var x: C?

as an optional kind? Because otherwise, I can't assign a nil to 
the instance, which I can do to a class instance in D... :)

Comparing non-optional types from swift with classes in D is... 
yeah... hmm... evil ;)

And if you assume a kind which cannot be nil, then you are again 
with structs here...

But I wondered about something different: even if the compiler 
would check the existence of an assignment, the runtime 
information cannot be deduced, if I understand this correctly. 
And if so, it cannot be checked at compile time, if something is 
or is not null. Right?
Nov 21 2018
parent aliak <something something.com> writes:
On Wednesday, 21 November 2018 at 23:27:25 UTC, Alex wrote:
 Nice! Didn't know that... But the language is a foreign one for 
 me.

 Nevertheless, from what I saw:
 Shouldn't it be
 var x: C?
 as an optional kind, because otherwise, I can't assign a nil to 
 the instance, which I can do to a class instance in D...

 out! :) )
This is true. But then the difference is that you can't* call a 
method on an optional variable without first unwrapping it (which 
is enforced at compile time as well).

* You can force unwrap it, and then you'd get a segfault if there 
was nothing inside the optional. But most times, if you see 
someone force unwrapping an optional, it's a code smell in swift.
 Comparing non-optional types from swift with classes in D is... 
 yeah... hmm... evil ;)
Hehe, maybe in a way. Was just trying to show that compilers can fix the null reference "problem" at compile time. And that flow analysis can detect initialization.
 And if you assume a kind which cannot be nil, then you are 
 again with structs here...

 But I wondered about something different:
 Even if the compiler would check the existence of an 
 assignment, the runtime information cannot be deduced, if I 
 understand this correctly. And if so, it cannot be checked at 
 compile time, if something is or is not null. Right?
Aye. But depending on how a language is designed this problem - if you think it is one - can be dealt with. It's why swift has optionals built in to the language.
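D doesn't build optionals into the language the way swift does, but the pattern can be roughly sketched with std.typecons.Nullable (the function and names here are hypothetical):

```
import std.stdio;
import std.typecons : Nullable, nullable;

// Returns "some int" or "none", instead of an int that might secretly be absent.
Nullable!int firstEven(int[] xs)
{
    foreach (x; xs)
        if (x % 2 == 0)
            return nullable(x);
    return Nullable!int.init; // the "none" state
}

void main()
{
    auto r = firstEven([1, 3, 4]);
    if (!r.isNull)       // caller is nudged to check before unwrapping
        writeln(r.get);  // prints 4
}
```

Note the difference from swift: calling .get on an empty Nullable is only caught at runtime, not at compile time, so this is a library convention rather than enforced optionals.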
Nov 22 2018
prev sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 17:00:29 UTC, Alex wrote:
 This was not my point. I wonder, whether the case, where the 
 compiler can't figure out the initialization state of an object 
 is so hard to construct.

 ´´´
 import std.experimental.all;

 class C
 {
 	size_t dummy;
 	final void baz()
 	{
 		if(this is null)
 		{
 			writeln(42);
 		}
 		else
 		{
 			writeln(dummy);
 		}
 	}
 }
 void main()
 {
 	C c;
 	if(uniform01 < 0.5)
 	{
 		c = new C();
 		c.dummy = unpredictableSeed;
 	}
         else
         {
                 c = null;
         }
 	c.baz;
 	writeln(c is null);
 }
 ´´´


As `c` is initialized in both branches, compiler knows it's always in initialized state after the if statement.
Nov 22 2018
prev sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 21 November 2018 at 10:47:35 UTC, NoMoreBugs wrote:
 On Monday, 19 November 2018 at 21:39:22 UTC, Adam D. Ruppe 
 wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 What's the reasoning for allowing this?
The mistake is immediately obvious when you run the program, so I just don't see it as a big deal. You lose a matter of seconds, realize the mistake, and fix it. What is your proposal for handling it? The ones usually put around are kinda a pain to use.
How hard would it be, really, for the compiler to determine that c was never assigned to, and produce a compile time error: "c is never assigned to, and will always have its default value null" That doesn't sound that hard to me.
For _TRIVIAL_ cases this is not hard.

But we cannot only worry about trivial cases; we have to consider 
_all_ cases.

Therefore we had better not emit an error in a trivial case: 
doing so could lead users to assume that we are detecting all the 
cases. That in turn would give the impression of an unreliable 
system, and indeed that impression would not be too far from the 
truth.
Nov 21 2018
parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Wednesday, 21 November 2018 at 17:11:23 UTC, Stefan Koch wrote:
 For _TRIVIAL_cases this is not hard.

 But we cannot only worry about trivial cases;
 We have to consider _all_ cases.

 Therefore we better not emit an error in a trivial case.
 Which could lead users to assume that we are detecting all the 
 cases.
 That in turn will give the impression of an unreliable system, 
 and indeed that impression would not be too far from the truth.
On the face of it, that seems a reasonable argument, i.e. 
consistency.

On the other hand, I see nothing 'reliable' about handing off the 
responsibility of detecting run-time errors to the o/s ;-)

I would prefer to catch these errors at compile time, or at run 
time. D can do neither, it seems.
Nov 21 2018
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/19/18 4:23 PM, Jordi Gutiérrez Hermoso wrote:
 When I was first playing with D, I managed to create a segfault by doing 
 `SomeClass c;` and then trying do something with the object I thought I 
 had default-created, by analogy with C++ syntax. Seasoned D programmers 
 will recognise that I did nothing of the sort and instead created c is 
 null and my program ended up dereferencing a null pointer.
 
 I'm not the only one who has done this. I can't find it right now, but 
 I've seen at least one person open a bug report because they 
 misunderstood this as a bug in dmd.
 
 I have been told a couple of times that this isn't something that needs 
 to be patched in the language, but I don't understand. It seems like a 
 very easy way to generate a segfault (and not a NullPointerException or 
 whatever).
 
 What's the reasoning for allowing this?
A null pointer dereference is an immediate error, and it's also a 
safe error. It does not cause corruption, and it is free (the MMU 
is doing it for you).

Note, you can get a null pointer exception on Linux by using 
etc.linux.memoryerror: 
https://github.com/dlang/druntime/blob/master/src/etc/linux/memoryerror.d

The worst part about a null-pointer segfault is when it's 
intermittent and you get no information about where it happens. 
Then it can be annoying to track down. But it can't be used as an 
exploit. Consistent segfaults are generally easy to figure out.

-Steve
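A sketch of the etc.linux.memoryerror hack mentioned above, assuming its registerMemoryErrorHandler entry point (Linux-only; catching an Error like this is for illustration, not production use):

```
// Linux-only sketch: druntime's etc.linux.memoryerror installs a SIGSEGV
// handler that turns a null dereference into a throwable Error with a trace.
import std.stdio;

void main()
{
    version (linux)
    {
        import etc.linux.memoryerror : registerMemoryErrorHandler;
        registerMemoryErrorHandler();
    }

    int* p = null;
    try
        writeln(*p);       // null dereference
    catch (Error e)        // an Error instead of a bare segfault, with the handler installed
        writeln("caught: ", e.msg);
}
```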
Nov 19 2018
parent reply Jordi =?UTF-8?B?R3V0acOpcnJleg==?= Hermoso <jordigh octave.org> writes:
On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer 
wrote:

 A null pointer dereference is an immediate error, and it's also 
 a safe error. It does not cause corruption, and it is free (the 
 MMU is doing it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Nov 19 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/19/18 7:21 PM, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer wrote:
 
 A null pointer dereference is an immediate error, and it's also a safe 
 error. It does not cause corruption, and it is free (the MMU is doing 
 it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
It's true for all OSes that D supports, and for most modern operating systems, that run in protected mode. It would NOT necessarily be true for kernel modules or an OS kernel, so that is something to be concerned about.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Yes and no. It's good to remember that this is a HARDWARE 
generated exception, and each OS handles it differently.

It's also important to remember that a segmentation fault is NOT 
necessarily the result of a simple error like forgetting to 
initialize a variable. It could be a serious memory corruption 
error. Generating stack traces can be dangerous in this kind of 
state.

As I said, on Linux you can enable a "hack" that generates an 
error for a null dereference. On Windows, I believe that it 
already generates an exception without any modification. On other 
OSes you may be out of luck until someone figures out a nice 
clever hack for it.

And if it's repeatable, you can always run in a debugger to see 
where the error is occurring.

-Steve
Nov 19 2018
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, November 19, 2018 5:30:00 PM MST Steven Schveighoffer via 
Digitalmars-d-learn wrote:
On 11/19/18 7:21 PM, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:52:47 UTC, Steven Schveighoffer wrote:
 A null pointer dereference is an immediate error, and it's also a safe
 error. It does not cause corruption, and it is free (the MMU is doing
 it for you).
Is this always true for all arches that D can compile to? I remember back in the DOS days with no memory protection you really could read OS data around the beginning.
It's true for all OSes that D supports, and for most modern operating systems, that run in protected mode. It would NOT necessarily be true for kernel modules or an OS kernel, so that is something to be concerned about.
For @safe to function properly, dereferencing null _must_ be guaranteed to be memory safe, and for dmd it is, since it will always segfault. Unfortunately, as I understand it, it is currently possible with ldc's optimizer to run into trouble, since it'll do things like see that something must be null and therefore assume that it must never be dereferenced, since it would clearly be wrong to dereference it. And then when the code hits a point where it _does_ try to dereference it, you get undefined behavior. It's something that needs to be fixed in ldc, but based on discussions I had with Johan at dconf this year about the issue, I suspect that the spec is going to have to be updated to be very clear on how dereferencing null has to be handled before the ldc guys do anything about it. As long as the optimizer doesn't get involved, everything is fine, but as great as optimizers can be at making code faster, they aren't really written with stuff like @safe in mind.
 Consistent segfaults are generally easy to figure out.
I think I would still prefer a stack trace like other kinds of D errors. Is this too difficult?
Yes and no. It's good to remember that this is a HARDWARE generated exception, and each OS handles it differently. It's also important to remember that a segmentation fault is NOT necessarily the result of a simple error like forgetting to initialize a variable. It could be a serious memory corruption error. Generating stack traces can be dangerous in this kind of state. As I said, on Linux you can enable a "hack" that generates an error for a null dereference. On Windows, I believe that it already generates an exception without any modification. On other OSes you may be out of luck until someone figures out a nice clever hack for it. And if it's repeatable, you can always run in a debugger to see where the error is occurring.
Also, if your OS supports core dumps, and you have them turned on, then it's trivial to get a stack trace - as well as a lot more of the program state. - Jonathan M Davis
Nov 19 2018
next sibling parent reply Johan Engelen <j j.nl> writes:
On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis 
wrote:
 For @safe to function properly, dereferencing null _must_ be 
 guaranteed to be memory safe, and for dmd it is, since it will 
 always segfault. Unfortunately, as understand it, it is 
 currently possible with ldc's optimizer to run into trouble, 
 since it'll do things like see that something must be null and 
 therefore assume that it must never be dereferenced, since it 
 would clearly be wrong to dereference it. And then when the 
 code hits a point where it _does_ try to dereference it, you 
 get undefined behavior. It's something that needs to be fixed 
 in ldc, but based on discussions I had with Johan at dconf this 
 year about the issue, I suspect that the spec is going to have 
 to be updated to be very clear on how dereferencing null has to 
 be handled before the ldc guys do anything about it. As long as 
 the optimizer doesn't get involved everything is fine, but as 
 great as optimizers can be at making code faster, they aren't 
 really written with stuff like @safe in mind.
One big problem is the way people talk and write about this 
issue. There is a difference between "dereferencing" in the 
language, and reading from a memory address by the CPU. Confusing 
language semantics with what the CPU is doing happens often in 
the D community and is not helping these debates.

D is proclaiming that dereferencing `null` must segfault, but 
that is not implemented by any of the compilers. It would require 
inserting null checks upon every dereference. (This may not be as 
slow as you may think, but it would probably not make code run 
faster.)

An example:
```
class A {
    int i;
    final void foo() {
        import std.stdio; writeln(__LINE__);
        // i = 5;
    }
}

void main() {
    A a;
    a.foo();
}
```

In this case, the actual null dereference happens on the last 
line of main. The program runs fine however since dlang 2.077. 
Now when `foo` is modified such that it writes to member field 
`i`, the program does segfault (writes to address 0).

D does not make dereferencing on class objects explicit, which 
makes it harder to see where the dereference is happening.

So, I think all compiler implementations are not spec compliant 
on this point. I think most people believe that compliance is too 
costly for the kind of software one wants to write in D; the 
issue is similar to array bounds checking that people explicitly 
disable or work around. For compliance we would need to change 
the compiler to emit null checks on all @safe dereferences (the 
opposite direction was chosen in 2.077). It'd be interesting to 
do the experiment.

-Johan
Nov 20 2018
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/20/18 1:04 PM, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis wrote:
 For @safe to function properly, dereferencing null _must_ be 
 guaranteed to be memory safe, and for dmd it is, since it will always 
 segfault. Unfortunately, as understand it, it is currently possible 
 with ldc's optimizer to run into trouble, since it'll do things like 
 see that something must be null and therefore assume that it must 
 never be dereferenced, since it would clearly be wrong to dereference 
 it. And then when the code hits a point where it _does_ try to 
 dereference it, you get undefined behavior. It's something that needs 
 to be fixed in ldc, but based on discussions I had with Johan at dconf 
 this year about the issue, I suspect that the spec is going to have to 
 be updated to be very clear on how dereferencing null has to be 
 handled before the ldc guys do anything about it. As long as the 
 optimizer doesn't get involved everything is fine, but as great as 
 optimizers can be at making code faster, they aren't really written 
 with stuff like @safe in mind.
One big problem is the way people talk and write about this issue. There is a difference between "dereferencing" in the language, and reading from a memory address by the CPU.
In general, I always consider "dereferencing" the point at which 
code follows a pointer to read or write its data. The semantics 
of modifying the type to mean the data vs. the pointer to it 
seems less interesting. Types are compiler-internal things; the 
actual reads and writes are what cause the problems.

But really, it's the act of using a pointer to read/write the 
data it points at which causes the segfault. And in D, we assume 
that this action is safe because of the MMU protecting the first 
page.
 Confusing language semantics with what the CPU is doing happens often in 
 the D community and is not helping these debates.
 
 D is proclaiming that dereferencing `null` must segfault but that is not 
 implemented by any of the compilers. It would require inserting null 
 checks upon every dereference. (This may not be as slow as you may 
 think, but it would probably not make code run faster.)
 
 An example:
 ```
 class A {
      int i;
      final void foo() {
           import std.stdio; writeln(__LINE__);
          // i = 5;
      }
 }
 
 void main() {
      A a;
      a.foo();
 }
 ```
 
 In this case, the actual null dereference happens on the last line of 
 main. The program runs fine however since dlang 2.077.
Right, the point is that the segfault happens when null pointers 
are used to get at the data. If you turn something that is 
ultimately a pointer into another type of pointer, then you 
aren't really dereferencing it. This happens when you pass 
*pointer into a function that takes a reference (or when you pass 
around a class reference).

In any case, the versions prior to 2.077 didn't segfault; they 
just had a prelude in front of every function which asserted that 
this wasn't null (you actually get a nice stack trace).
 Now when `foo` is modified such that it writes to member field `i`, the 
 program does segfault (writes to address 0).
 D does not make dereferencing on class objects explicit, which makes it 
 harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
 
 So, I think all compiler implementations are not spec compliant on this 
 point.
I think if the spec says that dereferencing doesn't mean 
following a pointer to its data, and reading/writing that data, 
and it says null dereferences cause a segfault, then the spec 
needs to be updated. The @safe segfault is what it should be 
focused on, not some abstract concept that exists only in the 
compiler. If it means changing the terminology, then we should do 
that.
 I think most people believe that compliance is too costly for the kind 
 of software one wants to write in D; the issue is similar to array 
 bounds checking that people explicitly disable or work around.
 For compliance we would need to change the compiler to emit null checks 
 on all @safe dereferences (the opposite direction was chosen in 2.077). 
 It'd be interesting to do the experiment.
The whole point of using the MMU instead of instrumentation is 
that we can avoid the performance penalties and still be safe. 
The only loophole is large structures that may extend beyond the 
protected region. I would suggest that the compiler inject an 
extra read of the front of any such data type (when @safe is 
enabled) to cause a segfault properly.

-Steve
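A sketch of the large-struct loophole (field names hypothetical): only the unmapped region around address 0 is guaranteed to fault, so a field at a large offset from a null base may not.

```
// Sketch of the loophole: a field far past the start of the struct may
// lie outside the OS's unmapped guard region around address 0.
struct Huge
{
    ubyte[1 << 20] pad; // 1 MiB of padding pushes `tail` far from the base
    int tail;
}

void main()
{
    Huge* p = null;
    // `p.tail` would read address 1048576; whether that faults depends on
    // what the OS happens to have mapped there. An injected read of the
    // struct's first byte, as suggested above, would hit the
    // guaranteed-unmapped first page and fault reliably.
}
```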
Nov 20 2018
next sibling parent reply Johan Engelen <j j.nl> writes:
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, which 
 makes it harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing 
a function of the object and calling it. Whether it is a virtual 
function or a final function, that shouldn't matter. There are 
different ways of implementing class function calls, but here 
people often seem to pin things down to one specific way. I feel 
I stand alone in the D community in treating the language in this 
abstract sense (like C and C++ do; other languages I don't know). 
It's similar to how people think that local variables and the 
function return address are put on a stack, even though that is 
just an implementation detail that is free to be changed (and 
does often change: local variables are regularly _not_ stored on 
the stack [*]).

Optimization isn't allowed to change the behavior of a program, 
yet already simple dead-code elimination would when null 
dereference is not treated as UB or when it is not guarded by a 
null check. Here is an example of code that also does what you 
call a "dereference" (it reads an object data member):

```
class A {
    int i;
    final void foo() {
        int a = i; // no crash with -O
    }
}

void main() {
    A a;
    a.foo(); // dereference happens
}
```

When you don't call `a.foo()` a dereference, you basically say 
that `this` is allowed to be `null` inside a class member 
function. (And then it'd have to be normal to do `if (this) ...` 
inside class member functions...)

These discussions are hard to do on a mailing list, so I'll stop 
here. Until next time at DConf, I suppose... ;-)

-Johan

[*] I intentionally didn't say where those local variables _are_ 
stored, so that people can solve that little puzzle for 
themselves ;-)
Nov 20 2018
next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 23:14:27 +0000, Johan Engelen wrote:
 When you don't call `a.foo()` a dereference, you basically say that
 `this` is allowed to be `null` inside a class member function. (and then
 it'd have to be normal to do `if (this) ...` inside class member
 functions...)
That's what we have today:

module scratch;
import std.stdio;

class A {
    int i;
    final void f() {
        writeln(this is null);
        writeln(i);
    }
}

void main() {
    A a;
    a.f();
}

This prints `true` and then gets a segfault. Virtual function calls 
have to do a dereference to figure out which potentially overridden 
function to call.
Nov 20 2018
parent Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 03:05:07 UTC, Neia Neutuladh 
wrote:
 Virtual function calls have to do a dereference to figure out 
 which potentially overrided function to call.
"have to do a dereference" in terms of "dereference" as language semantic: yes. "have to do a dereference" in terms of "dereference" as reading from memory: no. If you have proof of the runtime type of an object, then you can use that information to have the CPU call the overrided function directly without reading from memory. -Johan
Nov 21 2018
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
 Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, 
 which makes it harder to see where the dereference is 
 happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter.
It matters a lot. A virtual function is reached through a pointer 
stored in the instance (the vtable), so there is a dereference of the 
`this` pointer to get the address of the function. For a final 
function, the address of the function is known at compile time and no 
dereferencing is necessary.

That is a thing that a lot of people do not get: a member function and 
a plain function are basically the same thing. What distinguishes them 
is their mangled name. You can call a non-virtual member function from 
an assembly source if you know the symbol name. UFCS uses this fact, 
that a member function and a plain function are indistinguishable from 
an object-code point of view, to fake member functions.
 There are different ways of implementing class function calls, 
 but here often people seem to pin things down to one specific 
 way. I feel I stand alone in the D community in treating the 
 language in this abstract sense (like C and C++ do, other 
 languages I don't know). It's similar to that people think that 
 local variables and the function return address are put on a 
 stack; even though that is just an implementation detail that 
 is free to be changed (and does often change: local variables 
 are regularly _not_ stored on the stack [*]).

 Optimization isn't allowed to change behavior of a program, yet 
 already simple dead-code-elimination would when null 
 dereference is not treated as UB or when it is not guarded by a 
 null check. Here is an example of code that also does what you 
 call a "dereference" (read object data member):
 ```
 class A {
     int i;
     final void foo() {
         int a = i; // no crash with -O
     }
 }

 void main() {
     A a;
     a.foo();  // dereference happens
 }
No. There's no dereferencing. foo does nothing visible and can be replaced by a NOP. For the call, no dereferencing required.
 ```

 When you don't call `a.foo()` a dereference, you basically say
Again, no dereferencing for a (final) function call. `a.foo()` is the same thing as `foo(a)` by reverse UFCS. The generated code is identical. It is only the compiler that will use different mangled names.
 that `this` is allowed to be `null` inside a class member 
 function. (and then it'd have to be normal to do `if (this) 
 ...` inside class member functions...)

 These discussions are hard to do on a mailinglist, so I'll stop 
 here. Until next time at DConf, I suppose... ;-)

 -Johan

 [*] intentionally didn't say where those local variables _are_ 
 stored, so that people can solve that little puzzle for 
 themselves ;-)
Nov 21 2018
parent Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 09:31:41 UTC, Patrick Schluter 
wrote:
 On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen 
 wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
 Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, 
 which makes it harder to see where the dereference is 
 happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter.
 It matters a lot. A virtual function is reached through a pointer 
 stored in the instance, so there is a dereference of the this 
 pointer to get the address of the function. For a final function, 
 the address of the function is known at compile time and no 
 dereferencing is necessary.

 That is a thing that a lot of people do not get: a member function 
 and a plain function are basically the same thing. What 
 distinguishes them is their mangled name. You can call a non-virtual 
 member function from an assembly source if you know the symbol name. 
 UFCS uses this fact, that a member function and a plain function are 
 indistinguishable from an object-code point of view, to fake member 
 functions.
This and the rest of your email is exactly the kind of thinking that I 
oppose, where language semantics and compiler implementation are being 
mixed. I don't think it's possible to write an optimizing compiler 
where that way of reasoning works. So D doesn't do that, and we have 
to treat language semantics separately from implementation details.

(Virtual functions don't have to be implemented using vtables, local 
variables don't have to be on a stack, "a+b" does not need to result 
in a CPU add instruction, "foo()" does not need to result in a CPU 
procedure call instruction, etc, etc, etc. D is not a portable 
assembly language.)

-Johan
Nov 21 2018
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/20/18 6:14 PM, Johan Engelen wrote:
 On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven Schveighoffer wrote:
 On 11/20/18 1:04 PM, Johan Engelen wrote:
 D does not make dereferencing on class objects explicit, which makes 
 it harder to see where the dereference is happening.
Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at.
 But `a.foo()` is already using the object's data: it is accessing a 
 function of the object and calling it. Whether it is a virtual 
 function, or a final function, that shouldn't matter. There are 
 different ways of implementing class function calls, but here often 
 people seem to pin things down to one specific way. I feel I stand 
 alone in the D community in treating the language in this abstract 
 sense (like C and C++ do, other languages I don't know). It's 
 similar to that people think that local variables and the function 
 return address are put on a stack; even though that is just an 
 implementation detail that is free to be changed (and does often 
 change: local variables are regularly _not_ stored on the stack [*]).

 Optimization isn't allowed to change behavior of a program, yet 
 already simple dead-code-elimination would when null dereference is 
 not treated as UB or when it is not guarded by a null check. Here is 
 an example of code that also does what you call a "dereference" 
 (read object data member):
 ```
 class A {
     int i;
     final void foo() {
         int a = i; // no crash with -O
     }
 }

 void main() {
     A a;
     a.foo();  // dereference happens
 }
 ```
I get what you are saying. But in terms of memory safety *both 
results* are safe. The one where the code is eliminated is safe, and 
the one where the segfault happens is safe.

This is a tricky area, because D depends on a hardware feature for 
language correctness. In other words, it's perfectly possible for a 
null read or write to not result in a segfault, which would make D's 
allowance of dereferencing a null object without checking for null 
actually unsafe (now it's just another dangling pointer).

In terms of language semantics, I don't know what the right answer is. 
If we want to say that if an optimizer changes program behavior, the 
code must be UB, then this would have to be UB.

But I would prefer saying something like -- if a segfault occurs and 
the program continues, the system is in UB-land, but otherwise, it's 
fine. If this means an optimized program runs and a non-optimized one 
crashes, then that's what it means. I'd be OK with that result. It's 
like Schrodinger's segfault!

I don't know what it means in terms of compiler assumptions, so that's 
where my ignorance will likely get me in trouble :)
 These discussions are hard to do on a mailinglist, so I'll stop here. 
 Until next time at DConf, I suppose... ;-)
Maybe that is a good time to discuss for learning how things work. But clearly people would like to at least have a say here. I still feel like using the hardware to deal with null access is OK, and a hard-crash is the best result for something that clearly would be UB otherwise. -Steve
Nov 22 2018
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 22.11.18 16:19, Steven Schveighoffer wrote:
 
 In terms of language semantics, I don't know what the right answer is. 
 If we want to say that if an optimizer changes program behavior, the 
 code must be UB, then this would have to be UB.
 
 But I would prefer saying something like -- if a segfault occurs and the 
 program continues, the system is in UB-land, but otherwise, it's fine. 
 If this means an optimized program runs and a non-optimized one crashes, 
 then that's what it means. I'd be OK with that result. It's like 
 Schrodinger's segfault!
 
 I don't know what it means in terms of compiler assumptions, so that's 
 where my ignorance will likely get me in trouble :)
This is called nondeterministic semantics, and it is a good idea if you want both efficiency and memory safety guarantees, but I don't know how well our backends would support it. (However, I think it is necessary anyway, e.g. to give semantics to pure functions.)
Dec 03 2018
prev sibling parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven 
Schveighoffer wrote:
 But really, it's the act of using a pointer to read/write the 
 data it points at which causes the segfault. And in D, we 
 assume that this action is  safe because of the MMU protecting 
 the first page.
This is like me saying I won't bother locking up when I leave the house, cause if the alarm goes off the security company will come around and take care of things anyway. But by then, it's too late.
 D is proclaiming that dereferencing `null` must segfault but 
 that is not implemented by any of the compilers. It would 
 require inserting null checks upon every dereference. (This 
 may not be as slow as you may think, but it would probably not 
 make code run faster.)
Aristotle would have immediately solved this dilemma. Null is a valid value for reference types. Dereferencing null can lead to bad things happening. Therefore, check for null before dereferencing a reference type. Problem solved.
Nov 20 2018
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, November 20, 2018 11:04:08 AM MST Johan Engelen via Digitalmars-
d-learn wrote:
 On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis

 wrote:
 For  safe to function properly, dereferencing null _must_ be
 guaranteed to be memory safe, and for dmd it is, since it will
 always segfault. Unfortunately, as understand it, it is
 currently possible with ldc's optimizer to run into trouble,
 since it'll do things like see that something must be null and
 therefore assume that it must never be dereferenced, since it
 would clearly be wrong to dereference it. And then when the
 code hits a point where it _does_ try to dereference it, you
 get undefined behavior. It's something that needs to be fixed
 in ldc, but based on discussions I had with Johan at dconf this
 year about the issue, I suspect that the spec is going to have
 to be updated to be very clear on how dereferencing null has to
 be handled before the ldc guys do anything about it. As long as
 the optimizer doesn't get involved everything is fine, but as
 great as optimizers can be at making code faster, they aren't
 really written with stuff like  safe in mind.
One big problem is the way people talk and write about this issue. 
There is a difference between "dereferencing" in the language, and 
reading from a memory address by the CPU. Confusing language semantics 
with what the CPU is doing happens often in the D community and is not 
helping these debates.

D is proclaiming that dereferencing `null` must segfault, but that is 
not implemented by any of the compilers. It would require inserting 
null checks upon every dereference. (This may not be as slow as you 
may think, but it would probably not make code run faster.)

An example:

```
class A {
    int i;
    final void foo() {
        import std.stdio;
        writeln(__LINE__);
        // i = 5;
    }
}

void main() {
    A a;
    a.foo();
}
```

In this case, the actual null dereference happens on the last line of 
main. The program runs fine however since dlang 2.077. Now when `foo` 
is modified such that it writes to member field `i`, the program does 
segfault (writes to address 0).

D does not make dereferencing on class objects explicit, which makes 
it harder to see where the dereference is happening.
Yeah. It's one of those areas where the spec will need to be clear. 
Like C++, D doesn't actually dereference unless it needs to. And IMHO, 
that's fine. The core issue is that operations that aren't memory safe 
can't be allowed to happen in @safe code, and the spec needs to be 
defined in such a way that requires that to be true, though not 
necessarily by being super specific about every detail of how a 
compiler is required to do it.
 So, I think all compiler implementations are not spec compliant
 on this point.
 I think most people believe that compliance is too costly for the
 kind of software one wants to write in D; the issue is similar to
 array bounds checking that people explicitly disable or work
 around.
 For compliance we would need to change the compiler to emit null
 checks on all  safe dereferences (the opposite direction was
 chosen in 2.077). It'd be interesting to do the experiment.
Ultimately here, the key thing is that it must be guaranteed that 
dereferencing null is safe in @safe code (regardless of whether that 
involves * or . and regardless of how that is achieved). It must never 
read from or write to invalid memory. If it can, then dereferencing a 
null pointer or class reference is not memory safe, and since there's 
no way to know whether a pointer or class reference is null or not via 
the type system, dereferencing pointers and references in general 
would then be @system, and that simply can't be the case, or @safe is 
completely broken.

Typically, that protection is done right now via segfaults, but we 
know that that's not always possible. For instance, if the object is 
large enough (larger than one page size IIRC), then attempting to 
dereference a null pointer won't necessarily segfault. It can actually 
end up accessing invalid memory if you try to access a member variable 
that's deep enough in the object. I know that in that particular case, 
Walter's answer to the problem is that such objects should be illegal 
in @safe code, but AFAIK, neither the compiler nor the spec have yet 
been updated to match that decision, which needs to be fixed.

But regardless, in any and all cases where we determine that a 
segfault won't necessarily protect against accessing invalid memory 
when a null pointer or reference is dereferenced, then we need to do 
_something_ to guarantee that that code is @safe - which probably 
means adding additional null checks in most cases, though in the case 
of the overly large object, Walter has a different solution.

IMHO, requiring something in the spec like "it must segfault when 
dereferencing null", as has been suggested before, is probably not a 
good idea and is really getting too specific (especially considering 
that some folks have argued that not all architectures segfault like 
x86 does), but ultimately, the question needs to be discussed with 
Walter. I did briefly discuss it with him at this last dconf, but I 
don't recall exactly what he had to say about the ldc optimization 
stuff. I _think_ that he was hoping that there was a way to tell the 
optimizer to just not do that kind of optimization, but I don't 
remember for sure. Ultimately, the two of you will probably have to 
discuss it.

Either way, I know that he wanted a bugzilla issue on the topic, but I 
keep forgetting about it. First, I need to at least dig through the 
spec to figure out what it actually says right now, which probably 
isn't much.

- Jonathan M Davis
Nov 20 2018
parent reply Johan Engelen <j j.nl> writes:
On Wednesday, 21 November 2018 at 07:47:14 UTC, Jonathan M Davis 
wrote:
 IMHO, requiring something in the spec like "it must segfault 
 when dereferencing null" as has been suggested before is 
 probably not a good idea is really getting too specific 
 (especially considering that some folks have argued that not 
 all architectures segfault like x86 does), but ultimately, the 
 question needs to be discussed with Walter. I did briefly 
 discuss it with him at this last dconf, but I don't recall 
 exactly what he had to say about the ldc optimization stuff. I 
 _think_ that he was hoping that there was a way to tell the 
 optimizer to just not do that kind of optimization, but I don't 
 remember for sure.
The issue is not specific to LDC at all. DMD also does optimizations 
that assume that dereferencing [*] null is UB. The example I gave is 
dead-code elimination of a dead read of a member variable inside a 
class method, which can only be done either if the spec says that 
`a.foo()` is UB when `a` is null, or if `this.a` is UB when `this` is 
null.

[*] I notice you also use "dereference" for an execution machine [**] 
reading from a memory address, instead of the language doing a 
dereference (which may not necessarily mean a read from memory).

[**] Intentional weird name for the CPU? Yes. We also have D code 
running as webassembly...

-Johan
Nov 21 2018
next sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 21 November 2018 at 22:24:06 UTC, Johan Engelen 
wrote:
 The issue is not specific to LDC at all. DMD also does 
 optimizations that assume that dereferencing [*] null is UB.
Do you have an example? I think it treats null dereference as implementation defined but otherwise safe.
Nov 22 2018
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, November 21, 2018 3:24:06 PM MST Johan Engelen via 
Digitalmars-d-learn wrote:
 On Wednesday, 21 November 2018 at 07:47:14 UTC, Jonathan M Davis

 wrote:
 IMHO, requiring something in the spec like "it must segfault
 when dereferencing null" as has been suggested before is
 probably not a good idea is really getting too specific
 (especially considering that some folks have argued that not
 all architectures segfault like x86 does), but ultimately, the
 question needs to be discussed with Walter. I did briefly
 discuss it with him at this last dconf, but I don't recall
 exactly what he had to say about the ldc optimization stuff. I
 _think_ that he was hoping that there was a way to tell the
 optimizer to just not do that kind of optimization, but I don't
 remember for sure.
The issue is not specific to LDC at all. DMD also does optimizations that assume that dereferencing [*] null is UB. The example I gave is dead-code-elimination of a dead read of a member variable inside a class method, which can only be done either if the spec says that`a.foo()` is UB when `a` is null, or if `this.a` is UB when `this` is null. [*] I notice you also use "dereference" for an execution machine [**] reading from a memory address, instead of the language doing a dereference (which may not necessarily mean a read from memory). [**] intentional weird name for the CPU? Yes. We also have D code running as webassembly...
Skipping a dereference of null shouldn't be a problem as far as memory 
safety goes. The issue is if the compiler decides that UB allows it to 
do absolutely anything, and it rearranges the code in such a way that 
invalid memory is accessed. That cannot be allowed in @safe code in 
any D compiler. The code doesn't need to actually segfault, but it 
absolutely cannot access invalid memory even when optimized.

Whether dmd's dead-code elimination algorithm is able to make @safe 
code unsafe, I don't know. I'm not familiar with dmd's internals, and 
in general, while I have a basic understanding of the stuff at the 
various levels of a compiler, once the discussion gets to stuff like 
machine instructions and how the optimizer works, my understanding 
definitely isn't deep. After we discussed this issue with regards to 
ldc at dconf, I brought it up with Walter, and he didn't seem to think 
that dmd had such a problem, but I didn't think to raise that 
particular possibility either. It wouldn't surprise me if dmd also had 
issues in its optimizer that made @safe not safe, and it wouldn't 
surprise me if it didn't. It's the sort of area where I'd expect ldc's 
more aggressive optimizations to be much more likely to run into 
trouble, and it's more likely to do things that Walter isn't familiar 
with, but that doesn't mean that Walter didn't miss anything with dmd 
either.

After all, he does seem to like the idea of allowing the optimizer to 
assume that assertions are true, and as far as I can tell based on 
discussions on that topic, he doesn't seem to have understood (or 
maybe just didn't agree) that if we did that, the optimizer can't be 
allowed to make that assumption if there's any possibility of the code 
not being memory safe if the assumption is wrong (at least not without 
violating the guarantees that @safe is supposed to provide), since if 
the assumption turns out to be wrong (which is quite possible, even if 
it's not likely in well-tested code), then @safe would then violate 
memory safety.

As I understand it, by definition, @safe code is supposed to not have 
undefined behavior in it, and certainly, if any compiler's optimizer 
takes undefined behavior as meaning that it can do whatever it wants 
at that point with no restrictions (which is what I gathered from our 
discussion at dconf), then I don't see how any D compiler's optimizer 
can be allowed to think that anything is UB in @safe code. That may be 
why Walter was updating various parts of the spec a while back to talk 
about compiler-defined as opposed to undefined, since there are 
certainly areas where the compiler can have leeway with what it does, 
but there are places (at least in @safe code) where there must be 
restrictions on what it can assume and do even when the implementation 
is given leeway, or @safe's memory safety guarantees won't actually be 
properly guaranteed.

In any case, clearly this needs to be sorted out with Walter, and the 
D spec needs to be updated in whatever manner best fixes the problem. 
Null pointers / references need to be guaranteed to be safe in @safe 
code. Whether that's going to require that the compiler insert 
additional null checks in at least some places, I don't know. I simply 
don't know enough about how things work with stuff like the 
optimizers, but it wouldn't surprise me if in at least some cases, the 
compiler is ultimately going to be forced to insert null checks. 
Certainly, at minimum, I think that it's quite clear that if a 
platform doesn't segfault like x86 does, then it would have to.

- Jonathan M Davis
Nov 22 2018
prev sibling parent reply Tony <tonytdominguez aol.com> writes:
isocpp.org just had a link to a blog post where someone makes a 
case for uninitialized variables in C++ being an advantage in 
that you can potentially get a warning regarding use of an 
uninitialized variable that points out an error in your code.

https://akrzemi1.wordpress.com/2018/11/22/treating-symptoms-instead-of-the-cause/
Nov 30 2018
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 1 December 2018 at 00:32:35 UTC, Tony wrote:
 isocpp.org just had a link to a blog post where someone makes a 
 case for uninitialized variables in C++ being an advantage in 
 that you can potentially get a warning regarding use of an 
 uninitialized variable that points out an error in your code.

 https://akrzemi1.wordpress.com/2018/11/22/treating-symptoms-instead-of-the-cause/
This is great when it works, but the problem is that it would be a 
gargantuan effort - and a compile-time sink - to make it work 
perfectly. When it's just about if-else-if chains, switches, or 
boolean logic as in the example, the analysis won't be too 
complicated. But swap those booleans out for a string, make the 
conditions test whether it's a phone number and whether it satisfies 
some predicate implemented in a foreign language, and you'll see where 
the problem is.
Dec 01 2018
parent reply Tony <tonytdominguez aol.com> writes:
On Saturday, 1 December 2018 at 11:16:49 UTC, Dukc wrote:
 This is great when it works, but the problem is that it would 
 be gargantuan effort -and compile time sink- to make it work 
 perfectly. When it's just about if-else if chains, switches or 
 boolean logic as in the example, the analysis won't be too 
 complicated. But swap those booleans out for a string, and make 
 the conditions to test whether it's a phone number, and whether 
 it satisfies some predicate implemented in a foreign language, 
 and you'll see where the problem is.
I think he is just talking about the compiler or static analyzer seeing if a variable has been given a value before it is used, not if it was given a valid value.
Dec 01 2018
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Dec 01, 2018 at 06:30:05PM +0000, Tony via Digitalmars-d-learn wrote:
 On Saturday, 1 December 2018 at 11:16:49 UTC, Dukc wrote:
 This is great when it works, but the problem is that it would be
 gargantuan effort -and compile time sink- to make it work perfectly.
 When it's just about if-else if chains, switches or boolean logic as
 in the example, the analysis won't be too complicated. But swap
 those booleans out for a string, and make the conditions to test
 whether it's a phone number, and whether it satisfies some predicate
 implemented in a foreign language, and you'll see where the problem
 is.
I think he is just talking about the compiler or static analyzer seeing if a variable has been given a value before it is used, not if it was given a valid value.
But that's precisely the problem. It's not always possible to tell 
whether a variable has been initialized. E.g.:

	int func(int x) {
		int *p;

		if (solveRiemannHypothesis()) {
			p = &x;
		}

		...

		if (solveArtinsConjecture()) {
			*p++;
		}
		return x;
	}

For arbitrarily complex intervening code, determining whether or not a 
certain code path would be taken (that would initialize the variable) 
is equivalent to solving the halting problem, which is undecidable.

In the above contrived example, Artin's conjecture is implied by the 
Riemann hypothesis, so the second if statement would only run if p is 
initialized. But there is no way the compiler is going to be able to 
deduce this, especially not during compile time. So it is not possible 
to correctly flag p as being initialized or not when it is 
dereferenced.

Therefore, leaving it up to the compiler to detect uninitialized 
variables is unreliable, and therefore any code that depends on this 
cannot be trusted. Code like the above could be exploited by a 
sufficiently sophisticated hack to make the uninitialized value of p 
coincide with something that will open a security hole, and the 
compiler would not be able to reliably warn the programmer of this 
problem.

Uninitialized variables are *not* a good thing, contrary to what the 
author of the article might wish to believe.


T
Dec 01 2018
next sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:
 But that's precisely the problem. It's not always possible to 
 tell whether a variable has been initialized. E.g.:

 	int func(int x) {
 		int *p;

 		if (solveRiemannHypothesis()) {
 			p = &x;
 		}

 		...

 		if (solveArtinsConjecture()) {
 			*p++;
 		}
 		return x;
 	}
If you are willing to lose some precision you can still analyse this. 
Google "abstract interpretation".

For instance, after the first if, the value of p is (&x || null). 
Since the compiler can prove which branch is taken, the analysis has 
to assume both are. Inside the second if, p gets dereferenced, but 
since p is (&x || null) - that is, it might be null - that is a 
compile-time error.

The take-away is that you don't need to know which code path will be 
taken; you just combine both states.
Dec 01 2018
parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 1 December 2018 at 20:41:53 UTC, Sebastiaan Koppe 
wrote:
 Since the compiler can prove which branch is taken, the analyse 
 has to assume both are.
*can't*
Dec 01 2018
prev sibling next sibling parent aliak <something something.com> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:
 In the above contrived example, Artin's conjecture is implied 
 by the Riemann hypothesis, so the second if statement would 
 only run if p is initialized. But there is no way the compiler 
 is going to be able to deduce this, especially not during 
 compile time. So it is not possible to correctly flag p as 
 being initialized or not when it is dereferenced.

 Therefore, leaving it up to the compiler to detect 
 uninitialized variables is unreliable, and therefore any code 
 that depends on this cannot be trusted. Code like the above 
 could be exploited by a sufficiently sophisticated hack to make 
 the uninitialized value of p coincide with something that will 
 open a security hole, and the compiler would not be able to 
 reliably warn the programmer of this problem.

 Uninitialized variables are *not* a good thing, contrary to 
 what the author of the article might wish to believe.


 T
If a compiler were to issue warnings/errors for uninitialized 
variables, then that example would be a compiler error. The logic 
would just be that not all code paths lead to an initialized variable, 
therefore *p++ is not guaranteed to be initialized - i.e. an error.

Swift takes this approach.

Cheers,
- Ali
Dec 02 2018
prev sibling parent Tony <tonytdominguez aol.com> writes:
On Saturday, 1 December 2018 at 19:02:54 UTC, H. S. Teoh wrote:

 But that's precisely the problem. It's not always possible to 
 tell whether a variable has been initialized. E.g.:
To me, the possibility of a "false positive" doesn't preclude the use of a warning unless that possibility is large. Besides using a compiler option or pragma to get rid of it, the warning also goes away if you assign NULL or (X *) 0. Surprisingly, clang (gcc 6.3 does not give the warning) is not smart enough to then issue a "possibly dereferencing null pointer" warning.
 Therefore, leaving it up to the compiler to detect 
 uninitialized variables is unreliable, and therefore any code 
 that depends on this cannot be trusted. Code like the above 
 could be exploited by a sufficiently sophisticated hack to make 
 the uninitialized value of p coincide with something that will 
 open a security hole, and the compiler would not be able to 
 reliably warn the programmer of this problem.
I don't know that "leaving it up to the compiler" is a correct characterization. I don't see the programmer doing anything different with the warning capability in the compiler than if it wasn't there. In either case, the programmer will attempt to supply values to all the variables they have declared and are intending to use, and in the correct order.
Dec 02 2018
prev sibling next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Mon, 19 Nov 2018 21:23:31 +0000, Jordi Gutiérrez Hermoso wrote:
 When I was first playing with D, I managed to create a segfault by doing
 `SomeClass c;` and then trying do something with the object I thought I
 had default-created, by analogy with C++ syntax. Seasoned D programmers
 will recognise that I did nothing of the sort and instead created c is
 null and my program ended up dereferencing a null pointer.
Programmers coming from nearly any language other than C++ would find it expected and intuitive that declaring a class instance variable leaves it null. The compiler *could* give you a warning that you're using an uninitialized variable in a way that will lead to a segfault, but that sort of flow analysis gets hard fast.

If you wanted the default constructor to be called implicitly, that would make @nogc functions behave significantly differently (they'd forbid declarations without explicit initialization or would go back to default null), and it would be a problem for anything that doesn't have a no-args constructor (again, this would either be illegal or go back to null). Easier for everything to be consistent and everything to be initialized to null.
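A minimal sketch of the behaviour under discussion (the class name `SomeClass` is taken from the original post): declaring a class reference without `new` leaves it at its default value, null.

```d
import std.stdio;

class SomeClass
{
    int x;
}

void main()
{
    SomeClass c;            // no object is created; c is just a null reference
    assert(c is null);      // this is the state the original poster hit
    auto d = new SomeClass; // explicit allocation is required to get an object
    assert(d !is null);
    writeln("ok");
}
```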
Nov 19 2018
parent reply Jordi Gutiérrez Hermoso <jordigh octave.org> writes:
On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh wrote:

 Programmers coming from nearly any language other than C++ 
 would find it expected and intuitive that declaring a class 
 instance variable leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way? Either

   SomeClass c = null;

or

   SomeClass c = new SomeClass();

and nothing else.
 The compiler *could* give you a warning that you're using an 
 uninitialized variable in a way that will lead to a segfault, 
 but that sort of flow analysis gets hard fast.
Nulls/Nones are always a big gap in a language's type system. A common alternative is to have some Option/Maybe type like Rust or Haskell or D's Variant. How about making that required to plug the null gap?
 If you wanted the default constructor to be called implicitly,
Yeah, maybe this bit of C++ syntax isn't the best idea. What about other alternatives?
Nov 19 2018
next sibling parent NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 00:30:44 UTC, Jordi Gutiérrez 
Hermoso wrote:
 Yeah, maybe this bit of C++ syntax isn't the best idea. What 
 about other alternatives?
You could try testing for null before dereferencing ;-) If the following code in D did what you'd reasonably expect it to do, then you could do this:

-----
module test;

import std.stdio;

class C
{
    public int x;
}

void main()
{
    try
    {
        C c = null;
        c.x = 100;
    }
    catch(Exception e)
    {
        writeln(e.msg);
    }
}
-----
Nov 19 2018
prev sibling next sibling parent reply Neia Neutuladh <neia ikeran.org> writes:
On Tue, 20 Nov 2018 00:30:44 +0000, Jordi Gutiérrez Hermoso wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh wrote:
 
 Programmers coming from nearly any language other than C++ would find
 it expected and intuitive that declaring a class instance variable
 leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way?
The prevailing idea is that warnings are either non-problems, in which case they shouldn't be emitted, or things you really need to fix, in which case they should be errors. Things that are sometimes errors can be left to lint tools.
 Either
 
    SomeClass c = null;
 
 or
 
    SomeClass c = new SomeClass();
 
 and nothing else.
That would work, though it would be mildly tedious. However, the general philosophy with D is that things should be implicitly initialized to a default state equal to the `.init` property of the type. That default state can be user-defined with structs, but with other types, it is generally an 'empty' state that has well-defined semantics. For floating point values, that is NaN. For integers, it's 0. For arrays, it's a null array with length 0. For objects and pointers, it's null.
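The `.init` defaults described above can be checked directly; this small sketch asserts each of the default states listed:

```d
import std.math : isNaN;

class C {}

void main()
{
    double f; // floating point defaults to NaN
    int i;    // integers default to 0
    int[] a;  // arrays default to a null array with length 0
    C c;      // objects (class references) default to null
    assert(f.isNaN);
    assert(i == 0);
    assert(a is null && a.length == 0);
    assert(c is null);
}
```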
 Nulls/Nones are always a big gap in a language's type system. A common
 alternative is to have some Option/Maybe type like Rust or Haskell or
 D's Variant.
Variant is about storing arbitrary values in the same variable. Nullable is the D2 equivalent of Option or Maybe.
 How about making that required to plug the null gap?
That's extremely unlikely to make it into D2 and rather unlikely to make it into a putative D3. However, if you feel strongly enough about it, you can write a DIP. I've used Kotlin with its null safety, and I honestly haven't seen benefits from it. I have seen some NullPointerExceptions in slightly different places and some NullPointerExceptions instead of empty strings in log messages, but that's it.
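As a sketch of the Nullable-as-Option style mentioned above (the `find` helper is illustrative, not from the thread), `std.typecons.Nullable` makes the "no value" case explicit instead of encoding it as a raw null:

```d
import std.typecons : Nullable, nullable;

// Returns the first match, or an explicit "no value" instead of null.
Nullable!int find(int[] xs, int needle)
{
    foreach (x; xs)
        if (x == needle)
            return nullable(x);
    return Nullable!int.init; // empty Nullable, checked via isNull
}

void main()
{
    auto hit = find([1, 2, 3], 2);
    assert(!hit.isNull && hit.get == 2);

    auto miss = find([1, 2, 3], 9);
    assert(miss.isNull); // caller is forced to consider the missing case
}
```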
Nov 19 2018
parent aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 03:24:56 UTC, Neia Neutuladh 
wrote:
 On Tue, 20 Nov 2018 00:30:44 +0000, Jordi Gutiérrez Hermoso 
 wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh 
 wrote:
 
 Programmers coming from nearly any language other than C++ 
 would find it expected and intuitive that declaring a class 
 instance variable leaves it null.
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way?
The prevailing idea is that warnings are either non-problems, in which case they shouldn't be emitted, or things you really need to fix, in which case they should be errors. Things that are sometimes errors can be left to lint tools.
 Either
 
    SomeClass c = null;
 
 or
 
    SomeClass c = new SomeClass();
 
 and nothing else.
That would work, though it would be mildly tedious. However, the general philosophy with D is that things should be implicitly initialized to a default state equal to the `.init` property of the type. That default state can be user-defined with structs, but with other types, it is generally an 'empty' state that has well-defined semantics. For floating point values, that is NaN. For integers, it's 0. For arrays, it's a null array with length 0. For objects and pointers, it's null.
 Nulls/Nones are always a big gap in a language's type system. 
 A common alternative is to have some Option/Maybe type like 
 Rust or Haskell or D's Variant.
Variant is about storing arbitrary values in the same variable. Nullable is the D2 equivalent of Option or Maybe.
 How about making that required to plug the null gap?
That's extremely unlikely to make it into D2 and rather unlikely to make it into a putative D3. However, if you feel strongly enough about it, you can write a DIP. I've used Kotlin with its null safety, and I honestly haven't seen benefits from it. I have seen some NullPointerExceptions in slightly different places and some NullPointerExceptions instead of empty strings in log messages, but that's it.
Think this would highly depend on your usecase. Having crashing mobile apps mostly leads to bad reviews because it's a UX nightmare, for example. And with webservices it's a pain a lot of the time when it just crashes as well (analytics workers, for example). Kotlin's null safety stops you from this quite well as long as you don't interface with Java libraries - then it's near useless because your compiler guarantees go out the window. But Swift... so far ... 👌 It's also a code review blessing. You just know for sure that this code won't crash and the object is "valid" because they've properly unwrapped a nullable. I can't even count the number of times (and I'd wager there're millions of similar commits) where I've put up a commit (during my C++ days) that says "fix crash" and the code is just "if(!ptr) { return; }" or a variant of that. Ok, sorry, I rambled a bit :p Cheers, - Ali
Nov 19 2018
prev sibling next sibling parent aliak <something something.com> writes:
On Tuesday, 20 November 2018 at 00:30:44 UTC, Jordi Gutiérrez 
Hermoso wrote:
 On Monday, 19 November 2018 at 21:57:11 UTC, Neia Neutuladh 
 wrote:

 [...]
What do you think about making the syntax slightly more explicit and warn or possibly error out if you don't do it that way? Either SomeClass c = null; or SomeClass c = new SomeClass(); and nothing else.
 [...]
Nulls/Nones are always a big gap in a language's type system. A common alternative is to have some Option/Maybe type like Rust or Haskell or D's Variant. How about making that required to plug the null gap?
You can give optional (https://code.dlang.org/packages/optional) a try and see if that works for you.
Nov 19 2018
prev sibling parent Dukc <ajieskola gmail.com> writes:
 Nulls/Nones are always a big gap in a language's type system. A 
 common alternative is to have some Option/Maybe type like Rust 
 or Haskell or D's Variant. How about making that required to 
 plug the null gap?
There are others too who feel like that too: https://news.ycombinator.com/item?id=18588239
Dec 04 2018
prev sibling next sibling parent reply welkam <wwwelkam gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 Why does nobody seem to think that `null` is a serious problem 
 in D?
Because the more you learn about D the less you want to use classes. I view class as a compatibility feature for when you want to port Java code to D. For regular code just use structs. For inheritance you could use alias this, for polymorphism - templates, etc. If you want to write OOP code you don't have to always start with the keyword class. And since you don't use classes, you don't view null as a high priority problem.
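A minimal sketch of the struct-based style described above (the `Animal`/`Dog` names are illustrative): `alias this` gives subtyping-like reuse, and a template parameter gives polymorphism without a vtable:

```d
struct Animal
{
    string name;
    string speak() { return "..."; }
}

struct Dog
{
    Animal base;
    alias base this;          // Dog implicitly converts to Animal
    string speak() { return "woof"; }
}

// Compile-time polymorphism: resolved per type, no virtual dispatch.
string greet(T)(T a)
{
    return a.name ~ " says " ~ a.speak();
}

void main()
{
    auto d = Dog(Animal("Rex"));
    assert(greet(d) == "Rex says woof"); // Dog.speak wins, name via alias this
    Animal a = d;                        // alias this allows the conversion
    assert(a.name == "Rex");
}
```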
Nov 20 2018
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
Nov 20 2018
parent reply NoMoreBugs <NoMoreBugs gmail.com> writes:
On Tuesday, 20 November 2018 at 15:46:35 UTC, Adam D. Ruppe wrote:
 On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
a fan of classes...on the D forum? I don't get it. but of course you are right. classes do rock! In fact, there is not a better programming construct that I am aware of that provides a better 'explicit' mapping from external objects to program constructs. Thank you Kristen and Ole-Johan.
Nov 21 2018
parent reply welkam <wwwelkam gmail.com> writes:
On Wednesday, 21 November 2018 at 09:20:01 UTC, NoMoreBugs wrote:
 On Tuesday, 20 November 2018 at 15:46:35 UTC, Adam D. Ruppe 
 wrote:
 On Tuesday, 20 November 2018 at 13:27:28 UTC, welkam wrote:
 Because the more you learn about D the less you want to use 
 classes.
classes rock. You just initialize it. You're supposed to initialize *everything* anyway.
a fan of classes...on the D forum? I don't get it. but of course you are right. classes do rock! In fact, there is not a better programming construct that I am aware of, the provides a better 'explicit' mapping from external objects to program constructs. Thank you Kristen and Ole-Johan.
One thing that bugs me in programming is that in different programming languages the same things are named differently and things that are named the same are different. For example, D's classes and C++'s classes might be considered the same, but classes in C++ and Java are not the same thing. In D, classes are reference types and unless you mark them as final they will have a vtable. Let's face it, most people don't mark their classes as final. What all this means is that EVERY access to a class member value goes through an indirection (additional cost) and EVERY method call goes through 2 indirections (one to get the vtable and a second to call the function (method) from the vtable). Now Java also has indirect vtable calls, but it also has optimization passes that convert methods to final if they are not overridden. If Java didn't do that it would run as slow as Ruby. AFAIK D doesn't have such an optimization pass. On top of that some people want to check on EVERY dereference whether the pointer is not null. How slow do you want your programs to run?

That's the negatives, but what benefits do classes give us? First, being reference types, it's easy to move them in memory. That would be nice for a compacting GC, but D doesn't have a compacting GC. Second, they are useful for when you need to run code that someone else wrote for your project. Something like a plugin system. [sarcasm]This is happening everyday[/sarcasm] Third, porting code from Java to D.

Everything else you can do with structs and other D features.
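The reference-type point above is the crux of the class/struct split; a minimal sketch of the difference (illustrative `CPoint`/`SPoint` names):

```d
class CPoint  { int x; } // reference type
struct SPoint { int x; } // value type

void main()
{
    auto c1 = new CPoint;
    auto c2 = c1;        // copies the reference, not the object
    c2.x = 5;
    assert(c1.x == 5);   // both names alias one heap object

    SPoint s1;
    auto s2 = s1;        // copies the whole value
    s2.x = 5;
    assert(s1.x == 0);   // s1 is untouched
}
```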
Nov 21 2018
parent Neia Neutuladh <neia ikeran.org> writes:
On Wed, 21 Nov 2018 20:15:42 +0000, welkam wrote:
 In D classes are reference type and unless you mark them as final they
 will have vtable.
Even if you mark your class as final, it has a vtable because it inherits from Object, which has virtual functions. The ProtoObject proposal is for a base class that has no member functions. If you had a final class that inherited from ProtoObject instead of Object, it would have an empty vtable.
 Lets face it most people dont mark their classes as
 final. What all this mean is that EVERY access to class member value
 goes trough indirection (additional cost)
D classes support inheritance. They implicitly cast to their base types. They can add fields not present in their base types. If they were value types, this would mean you'd lose those fields when up-casting, and then you'd get memory corruption from calling virtual functions. That is a cost that doesn't happen with structs, I'll grant, but the only way to avoid that cost is to give up inheritance. And inheritance is a large part of the reason to use classes instead of structs.
 and EVERY method call goes
 trough 2 indirections (one to get vtable and second to call
 function(method) from vtable).
Virtual functions do, that is. That's the vast majority of class member function calls.
 Now Java also have indirect vtable calls
 but it also have optimization passes that convert methods to final if
 they are not overridden. If Java didnt do that it would run as slow as
 Ruby.
Yeah, no. https://git.ikeran.org/dhasenan/snippets/src/branch/master/virtualcalls/ results Java and DMD both managed to de-virtualize and inline the function. DMD can do this in simple cases; Java can do this in a much wider range of cases but can make mistakes (and therefore has to insert guard code that will go back to the original bytecode when its hunches were wrong). If it were merely devirtualization that were responsible for Java being faster than Ruby, Ruby might be ten times the duration of Java (just as dmd without optimizations is within times the duration of dmd without optimizations). You could also argue that `int += int` in Ruby is another virtual call, so it should be within twenty times the speed of Java. Instead, it's 160 times slower than Java.
 On top of that some
 people want to check on EVERY dereference if pointer is not null. How
 slow you want your programs to run?
Every program on a modern CPU architecture and modern OS checks every pointer dereference to ensure the pointer isn't null. That's what a segfault is. Once you have virtual address space as a concept, this is free.
 Thats negatives but what benefit classes give us?
 First being reference type its easy to move them in memory. That would
 be nice for compacting GC but D doesnt have compacting GC.
You can do that with pointers, too. D doesn't do that because (a) it's difficult and we don't have the people required to make it work well enough, (b) it would make it harder to interface with other languages, (c) unions mean we would be unable to move some objects and people tend to be less thrilled about partial solutions than complete ones.
 Second they
 are useful for when you need to run code that some one else wrote for
 your project. Something like plugin system. [sarcasm]This is happening
 everyday[/sarcasm]
 Third porting code from Java to D.
 
 Everything else you can do with struct and other D features.
Similarly, you can write Java-style object oriented code in C. It's hideously ugly and rather error-prone. Every project trying to do it would do it in a different and incompatible way. Walter decided a long time ago that language support for Java-style OOP was a useful component for D to have, and having a standardized way of doing it with proper language support was better than leaving it to a library.
Nov 21 2018
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax.
D is more similar to Java here and works like languages with GC. In C++, objects are garbage-created by default, so you would have a similar problem there. To diagnose crashes on linux you can run your program under gdb.
Nov 20 2018
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, November 20, 2018 8:38:40 AM MST Kagamin via Digitalmars-d-learn 
wrote:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez

 Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 by doing `SomeClass c;` and then trying do something with the
 object I thought I had default-created, by analogy with C++
 syntax.
D is more similar to Java here and works like languages with in C++ objects are garbage-created by default, so you would have a similar problem there. To diagnose crashes on linux you can run your program under gdb.
In C++, if the class is put directly on the stack, then you get a similar situation to D's structs, only instead of it being default-initialized, it's default-constructed. So, you don't normally get garbage when you just declare a variable of a class type (though you do with other types, and IIRC, if a class doesn't have a user-defined default constructor, and a member variable's type doesn't have a default constructor, then that member variable does end up being garbage). However, if you declare a pointer to a class (which is really more analogous to what you're doing when declaring a class reference in D), then it's most definitely garbage, and the behavior is usually _far_ worse than segfaulting.

So, while I can see someone getting annoyed about a segfault, because they forgot to initialize a class reference in D, the end result is far, far safer than what C++ does. And in most cases, you catch the bug pretty fast, because pretty much the only way that you don't catch it is if that piece of code is never tested. So, while D's approach is by no means perfect, I don't think that there's really any question that as far as memory safety goes, it's far superior to C++.

- Jonathan M Davis
Nov 20 2018
prev sibling next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax. Seasoned D programmers will recognise that I did 
 nothing of the sort and instead created c is null and my 
 program ended up dereferencing a null pointer.

 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).

 What's the reasoning for allowing this?
The natural way forward for D is to add static analysis in the compiler that tracks use of possibly uninitialized classes (and perhaps also pointers). This has been discussed many times on the forums. The important thing with such an extra warning is to incrementally add it without triggering any false positives. Otherwise programmers aren't gonna use it.
Nov 22 2018
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 22 November 2018 at 15:38:18 UTC, Per Nordlöw wrote:

 The natural way forward for D is to add static analysis in the 
 compiler that tracks use of possibly uninitialized classes (and 
 perhaps also pointers). This has been discussed many times on 
 the forums. The important thing with such an extra warning is 
 to incrementally add it without triggering any false positives. 
 Otherwise programmers aren't gonna use it.
I'd say the problem here is not just false positives, but false negatives!
Nov 22 2018
next sibling parent Neia Neutuladh <neia ikeran.org> writes:
On Thu, 22 Nov 2018 15:50:01 +0000, Stefan Koch wrote:
 I'd say the problem here is not just false positives, but false
 negatives!
False negatives are a small problem. The compiler fails to catch some errors some of the time, and that's not surprising. False positives are highly vexing because it means the compiler rejects valid code, and that sometimes requires ugly circumlocutions to make it work.
Nov 22 2018
prev sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 22 November 2018 at 15:50:01 UTC, Stefan Koch wrote:
 I'd say the problem here is not just false positives, but false 
 negatives!
With emphasis on _incremental_ additions to the compiler for covering more and more positives without introducing any _false_ negatives whatsoever. Without losing compilation performance. I recall Walter saying this is challenging to get right but a very interesting task. This would make D even more competitive against languages such as Rust.
Nov 22 2018
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Thursday, 22 November 2018 at 23:10:06 UTC, Per Nordlöw wrote:
 With emphasis on _incremental_ additions to the compiler for 
 covering more and more positives without introducing any 
 _false_ negatives whatsoever. Without loosing compilation 
 performance.
BTW, should such compiler checking in D include pointers besides mandatory class checking?
Nov 23 2018
prev sibling next sibling parent reply SimonN <eiderdaus gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that so few D users wish for them. I understand that it's very hard to get safely right, without code-flow analysis that Walter prefers to keep at minimum throughout D.

I'm concerned about the clarity of usercode. I would like to ensure in my function signatures that only non-null class references are accepted as input, or that only non-null class references will be returned. All possibilities in current D have drawbacks:

a) Add in/out contracts for over 90 % of the class variables? This is nasty boilerplate.

b) Check all arguments for null, check all returned values for null? This is against the philosophy that null should be cost-free. Also boilerplate.

c) Declare the function as if it accepts null, but segfault on receiving null? This looks like a bug in the program.

Even if c) becomes a convention in the codebase, then when the function segfaults in the future, it's not clear to maintainers whether the function or the caller has the bug. I discussed some ideas in 2018-03: https://forum.dlang.org/post/epjwwtstyphqknavycxt forum.dlang.org -- Simon
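Option a) above can be sketched with D's expression-based contract syntax (the `Resource`/`process` names are illustrative); note that contracts are only checked in non-release builds, which is part of why this counts as boilerplate rather than a guarantee:

```d
class Resource {}

// Contracts document and check non-null at the call boundary,
// at the cost of repeating this on nearly every signature.
Resource process(Resource r)
in (r !is null)                  // rejects a null argument
out (result; result !is null)    // promises a non-null return
{
    return r;
}

void main()
{
    auto ok = process(new Resource);
    assert(ok !is null);
}
```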
Nov 29 2018
parent reply Atila Neves <atila.neves gmail.com> writes:
On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 When I was first playing with D, I managed to create a segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d

"But I don't like the verbosity!"

    alias MyClass = NotNullable!MyClassImpl;
Nov 30 2018
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Friday, 30 November 2018 at 12:00:46 UTC, Atila Neves wrote:
 On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 When I was first playing with D, I managed to create a 
 segfault
 What's the reasoning for allowing this?
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d "But I don't like the verbosity!" alias MyClass = NotNullable!MyClassImpl;
Huh neat, though it would be nice to allow conversion of Nullable to NotNullable via runtime conditional checking.

    NotNullable!MyClassImpl = (MyClassImpvar != Null) ? MyClassImpvar : new MyClassImpvar();
Nov 30 2018
next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Friday, 30 November 2018 at 15:32:55 UTC, 12345swordy wrote:
 On Friday, 30 November 2018 at 12:00:46 UTC, Atila Neves wrote:
 On Thursday, 29 November 2018 at 18:31:41 UTC, SimonN wrote:
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 [...]
 [...]
100 % agree that there should be non-nullable class references, they're my main missing feature in D. Likewise, I'm astonished that only few D users wish for them.
https://github.com/aliak00/optional/blob/master/source/optional/notnull.d "But I don't like the verbosity!" alias MyClass = NotNullable!MyClassImpl;
Huh neat, though it would nice to allow conversion of Nullable to NotNullable via runtime conditional checking. NotNullable!MyClassImpl = (MyClassImpvar != Null) ? MyClassImpvar : new MyClassImpvar();
I meant new MyClassImp(), but you get the idea.
Nov 30 2018
prev sibling parent Kagamin <spam here.lot> writes:
On Friday, 30 November 2018 at 15:32:55 UTC, 12345swordy wrote:
 NotNullable!MyClassImpl = (MyClassImpvar != Null) ? 
 MyClassImpvar : new MyClassImpvar();
AFAIK it's something like NotNullable!MyClassImp m = MyClassImpvar.orElse(new MyClassImp());
Dec 01 2018
prev sibling next sibling parent reply O-N-S (ozan) <ozan.nurettin.sueel gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).
I love Null in an empty class variable and I use it very often in my code. It simplifies a lot. What would be a better way? (practical not theoretical) Regards Ozan
Nov 29 2018
parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 30 November 2018 at 06:15:29 UTC, O-N-S (ozan) wrote:
 On Monday, 19 November 2018 at 21:23:31
 On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
 Hermoso wrote:
 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).
I love Null in an empty class variable and I use it very often in my code. It simplifies a lot. What would be a better way? (practical not theoretical) Regards Ozan
A better way is to always initialise. Invalid states should be unrepresentable.
Nov 30 2018
prev sibling parent PacMan <jckj33 gmail.com> writes:
On Monday, 19 November 2018 at 21:23:31 UTC, Jordi Gutiérrez 
Hermoso wrote:
 When I was first playing with D, I managed to create a segfault 
 by doing `SomeClass c;` and then trying do something with the 
 object I thought I had default-created, by analogy with C++ 
 syntax. Seasoned D programmers will recognise that I did 
 nothing of the sort and instead created c is null and my 
 program ended up dereferencing a null pointer.

 I'm not the only one who has done this. I can't find it right 
 now, but I've seen at least one person open a bug report 
 because they misunderstood this as a bug in dmd.

 I have been told a couple of times that this isn't something 
 that needs to be patched in the language, but I don't 
 understand. It seems like a very easy way to generate a 
 segfault (and not a NullPointerException or whatever).

 What's the reasoning for allowing this?
This is because you're transferring what you know from C++ to D directly. You shouldn't do that; check out how the specific language works. `Foo f;` wouldn't make sense for me: it's not allocated, it's null. So right away I used `Foo f = new Foo();`
Dec 04 2018