digitalmars.D - null and type safety
- Brendan Miller (15/15) Nov 03 2008 So I'm curious about D as a programming language, especially as it compa...
- bearophile (7/9) Nov 03 2008 In a (system) language I want to be able to create tree data structures ...
- Thomas Leonard (16/29) Nov 04 2008 Note that the maybe types in Delight are independent of the syntax
- Andrei Alexandrescu (7/30) Nov 04 2008 It could be done if non-nullable pointers/references would be allowed in...
- bearophile (6/7) Nov 04 2008 I think some languages add something to help that because:
- Denis Koroskin (7/44) Nov 04 2008 Unfortunately, you can have null references in D:
- Walter Bright (3/6) Nov 04 2008 Yes, but those are neither type safe errors or memory safe errors. A
- Jarrett Billingsley (5/12) Nov 04 2008 Dereferencing a null pointer is *always* a bug, it doesn't matter how
- Walter Bright (7/11) Nov 04 2008 Sure. But I'm interested in creating a safe subset of D, and so the more...
- Jarrett Billingsley (7/13) Nov 04 2008 Have you looked at Delight at all? I wouldn't call the impact of
- bearophile (5/6) Nov 04 2008 Beside the topic of nullable types you are discussing about, Delight's l...
- Walter Bright (4/19) Nov 04 2008 Memory corruption is a big one. Another are sequential consistency bugs,...
- Robert Fraser (6/21) Nov 04 2008 FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've...
- Lionello Lunesu (6/11) Nov 04 2008 Hear hear!
- Walter Bright (13/17) Nov 05 2008 That isn't really the point of const. The point of const is to be able
- Nick B (5/5) Nov 05 2008 Can someone explain what is the plan for when Tango turns 1.0
- Steven Schveighoffer (18/25) Nov 05 2008 I was working in C# today, and I realized one very excellent design
- Walter Bright (9/10) Nov 05 2008 If that cannot be done in D, then D needs some design improvements.
- bearophile (31/34) Nov 05 2008 I presume you meant something like this:
- Hxal (10/21) Nov 06 2008 Does that mean we're getting implicit cast overloads?
- Walter Bright (2/4) Nov 06 2008 opImplicitCast, yes.
- BCS (18/34) Nov 06 2008 Why not explicitly support this with bodied typedefs?
- cemiller (5/11) Nov 04 2008 null pointers DO cause memory corruption:
- Walter Bright (3/7) Nov 04 2008 Yes, but so will any pointer that you index out of bounds. That's why
- Brendan Miller (20/27) Nov 04 2008 Well.. I can't speak for null pointers in D, but they can definitely cau...
- Walter Bright (11/21) Nov 04 2008 Those machines are obsolete, for excellent reasons. If, for some...
- Jarrett Billingsley (23/29) Nov 04 2008 The implication of non-nullable types isn't that nullable types...
- Walter Bright (5/25) Nov 05 2008 I don't see what you've gained here. The compiler certainly can do flow
- Michel Fortin (79/83) Nov 05 2008 I'm not sure how you're reading things, but to me having two kinds of
- bearophile (8/26) Nov 05 2008 The same is true making integral values become range values. If I want t...
- Michel Fortin (10/24) Nov 06 2008 It's exactly the same thing, except that for numbers you may want much
- Jarrett Billingsley (23/49) Nov 05 2008 What? Is your response in response to my post at all? I am not
- Walter Bright (6/9) Nov 05 2008 Sure, which is why I was puzzled at the example given, which is about
- Bill Baxter (8/51) Nov 05 2008 I didn't really get what you meant the first time either. The thing
- Jarrett Billingsley (8/13) Nov 05 2008 It's almost the same as D's variable-inside-an-if, with the addition
- Brendan Miller (12/37) Nov 05 2008 You mean D isn't mean to run on embedded hardware? I thought it was a sy...
- Mike Hearn (2/6) Nov 28 2008 Can't that be solved by reversing the syntax, ie, you mark variables tha...
So I'm curious about D as a programming language, especially as it compares to C++. One problem that C++ made a partial effort to solve was that normal pointers in C and C++ essentially aren't type safe. Consider that null can always be assigned to a pointer or reference to type T in those languages, and null is clearly *not* of type T, thus operations on a variable denoted of type T are doomed to fail.

T *myObject = null;
myObject->myMethod(); // fails, despite the fact that myObject is of type T
                      // and myMethod is defined for type T.

Null is a holdover from C and has no place in a typesafe language. The designers of C++ knew this, and so introduced the C++ reference type:

T &myObjectRef = ...;

which cannot be null.

T &myObjectRef = null; // fails at compile time
T &myObjectRef = *ptr; // if ptr is null, the operation is "undefined".

C++'s successors kept null largely for marketing purposes and never really thought through these issues (although I read an interview where Anders Hejlsberg admitted this occurred to him too late to fix).

This is obviously a problem. Everyone knows that null pointer exceptions are one of the biggest sources of runtime errors. Furthermore, there's no reason whatsoever that these problems can't be caught by the compiler in a strongly typed language like C++ that has the idea of a non-nullable pointer. The whole point of type annotations is to catch these errors before runtime, after all. Otherwise it's just a lot of useless typing. C++ partially solves the problem with references, and truly statically type-safe languages like ML solve it by making variables typesafe by default, and using an Optional type to wrap nullable types.

So my question is, as the successor to C++, how does D solve this problem? I'm looking through the D docs, which are a little sparse, but I'm not seeing any references to pointers that can't be nulled.

Brendan
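The ML-style Optional approach the post mentions can be sketched in modern C++ with std::optional (a much later addition to C++, used here purely for illustration; the function names are hypothetical):

```cpp
#include <optional>
#include <string>

// Absence of a result is encoded in the type (std::optional<T>) instead
// of a null pointer, so the caller is forced to unwrap before use.
std::optional<std::string> find_user(int id) {
    if (id == 42) return std::string("brendan");
    return std::nullopt;            // "no result", but not a null T
}

std::string greet(int id) {
    if (auto name = find_user(id))  // must check/unwrap explicitly
        return "hello, " + *name;
    return "unknown user";          // the missing case must be handled
}
```

Unlike a raw `T*`, the unchecked path simply doesn't type-check: there is no way to reach the string inside without going through the test.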
Nov 03 2008
Brendan Miller:

> Null is a holdover from C and has no place in a typesafe language.

In a (system) language I want to be able to create tree data structures that contain a cargo plus true pointers, and those pointers to structs can be null in leaves. Having a safe language is good, but I want 1 language that gives me sharp tools too.

> I'm looking through the D docs, that are a little sparse, but I'm not seeing any references to pointers that can't be nulled.

You may like the Delight language (it compiles to D2):
http://delight.sourceforge.net/null.html

Bye,
bearophile
Nov 03 2008
On Mon, 03 Nov 2008 17:04:48 -0500, bearophile wrote:

> You may like the Delight language (it compiles to D2):
> http://delight.sourceforge.net/null.html

Note that the maybe types in Delight are independent of the syntax changes (apart from the ? type annotation) and you could easily enable them when compiling D code too (basically, just remove the code that disables this feature when it detects it's compiling D syntax source).

The basic problem is that it's hard to integrate a language with non-nullable types with libraries written without them (e.g. when auto-generating bindings with BCD). This would likely be a big problem for D, since integrating with existing C and C++ code is a major feature.

Even if you add the annotations manually to an existing library, you often get a poor API. e.g. this GLib function for copying a string:

char*? g_strdup(const(char)*? s)

If you pass NULL in, you get NULL out. That's a useful convenience in C, but in Delight it forces you to check whether the result is null before you can use it. If the API had been designed for Delight in the first place, it wouldn't take or return a nullable type.
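The binding mismatch described above can be sketched in C++ (the names are illustrative stand-ins, not GLib's actual API): the C convention propagates NULL, so a faithful annotation makes both sides nullable, while a wrapper designed for a non-null world handles null once at the boundary and hands back a value that can never be null.

```cpp
#include <stdexcept>
#include <string>

// the C convention: NULL in, NULL out (stand-in for g_strdup)
const char* c_strdup_like(const char* s, std::string& storage) {
    if (s == nullptr) return nullptr;   // null propagates to the caller
    storage = s;                        // stand-in for the real allocation
    return storage.c_str();
}

// the non-null binding: check once at the boundary, return a value type
std::string dup_checked(const char* s) {
    if (s == nullptr)
        throw std::invalid_argument("null passed to dup_checked");
    return std::string(s);              // result can never be "null"
}
```

Callers of `dup_checked` never write another null check; callers of `c_strdup_like` must check every result forever.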
Nov 04 2008
Thomas Leonard wrote:

> The basic problem is that it's hard to integrate a language with non-nullable types with libraries written without them (e.g. when auto-generating bindings with BCD). This would likely be a big problem for D, since integrating with existing C and C++ code is a major feature.

It could be done if non-nullable pointers/references would be allowed in addition to nullable ones.

A note however - although I agree it's nice to have non-nullable types, the argument is a bit overstated as type safety has little to do with it. Null checks are easy and cheap to check for deterministically.

Andrei
Nov 04 2008
Andrei Alexandrescu:

> Null checks are easy and cheap to check for deterministically.

I think some languages add something to help that because:
- Sometimes you may forget to add those checks manually when you need them.
- If you don't need to put in those checks, the code comes out a little shorter and cleaner. Some kinds of programmers like this. C++ programmers probably care less about this.

Bye,
bearophile
Nov 04 2008
On Tue, 04 Nov 2008 00:10:29 +0300, Brendan Miller <catphive catphive.net> wrote:

> Null is a holdover from C and has no place in a typesafe language. [...] So my question is, as the successor to C++, how does D solve this problem?

Unfortunately, you can have null references in D:

Object o = null;

Nullable types have been proposed several times by many people, but I don't recall any response from Walter or Andrei :(
Nov 04 2008
Brendan Miller wrote:

> This is obviously a problem. Everyone knows that null pointer exceptions are one of the biggest sources of runtime errors.

Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.
Nov 04 2008
On Tue, Nov 4, 2008 at 3:32 PM, Walter Bright <newshound1 digitalmars.com> wrote:

> Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.

Dereferencing a null pointer is *always* a bug, it doesn't matter how "safe" it is. Don't you think that eliminating something that's always a bug at compile time is a worthwhile investment?
Nov 04 2008
Jarrett Billingsley wrote:

> Dereferencing a null pointer is *always* a bug, it doesn't matter how "safe" it is.

Sure. But I'm interested in creating a safe subset of D, and so the more correct interpretation of what constitutes "safety" is important.

> Don't you think that eliminating something that's always a bug at compile time is a worthwhile investment?

Not always. There's a commensurate increase in complexity that may not make it worth while. My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.
Nov 04 2008
On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright <newshound1 digitalmars.com> wrote:

> Not always. There's a commensurate increase in complexity that may not make it worth while.

Have you looked at Delight at all? I wouldn't call the impact of nullable types on D "commensurate." It's probably far less than const, invariant, pure, and escape analysis.

> My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.

Can you expand upon this a bit? What exactly are some bugs that can't be reliably detected at runtime other than memory corruption?
Nov 04 2008
Jarrett Billingsley:

> Have you looked at Delight at all?

Beside the topic of nullable types you are discussing, Delight's look is designed to appeal mostly to Python programmers (despite being just D2, a little sugared), and/or to people that care a lot about having a clean(er) syntax, so C/C++ programmers may be less interested...

If Delight becomes refined and debugged enough, I hope to see it bundled by default with the LDC compiler, as well as Tango, a GUI toolkit like GTK for D, and a few other goodies, like an editor/almost-IDE. I think it can become a way to "sell" D2 to other kinds of programmers.

Bye,
bearophile
Nov 04 2008
Jarrett Billingsley wrote:

> Have you looked at Delight at all? I wouldn't call the impact of nullable types on D "commensurate." It's probably far less than const, invariant, pure, and escape analysis.

Sorry, I have not looked at Delight.

> Can you expand upon this a bit? What exactly are some bugs that can't be reliably detected at runtime other than memory corruption?

Memory corruption is a big one. Others are sequential consistency bugs, then there's function hijacking.
Nov 04 2008
Walter Bright wrote:

> My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.

FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've often run into bugs that non-nullable types could have prevented (including one on a production system... well, there was another bug that raised an exception causing something else to be uninitialized, and the system came crashing down).
Nov 04 2008
"Robert Fraser" <fraserofthenight gmail.com> wrote in message news:geregs$sj1$1 digitalmars.com...FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've often run into bugs that non-nullable types could have prevented (including one on a production system... well, there was another bug that raised an exception causing something else to be uninitialized and the system came crashing down).Hear hear! Nullness should have nothing to do with a type having reference or value semantics. These two concepts are orthogonal. L.
Nov 04 2008
Robert Fraser wrote:

> FWIW, I've _never_ run into a bug const could have prevented.

That isn't really the point of const. The point of const is to be able to write functions that can accommodate both mutable and invariant arguments. The point of invariantness is to be able to prove that code has certain properties. This is much better than relying on your programming team never making a mistake.

For example, you can do functional programming in C++. It's just that the compiler cannot know you're doing that, and so cannot take advantage of it. Furthermore, the compiler cannot detect when code is not functional, and so if someone hands you a million lines of code you have no freakin' way of determining if it adheres to functional principles or not. This really matters when one starts doing concurrent programming.
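The "accommodate both mutable and invariant arguments" point can be sketched in C++, where it is the everyday role of `const&` parameters (the function name is illustrative):

```cpp
#include <cstddef>
#include <string>

// A const reference parameter accepts mutable strings, const strings,
// and temporaries alike, and the signature itself proves the function
// will not modify its argument.
std::size_t count_spaces(const std::string& s) {
    std::size_t n = 0;
    for (char c : s)
        if (c == ' ') ++n;
    return n;
}
```

Both `std::string mut = "a b";` and `const std::string fixed = "a b";` can be passed to it; drop the `const` from the parameter and the second call no longer compiles.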
Nov 05 2008
Can someone explain what the plan is for when Tango turns 1.0? Will the code be frozen? Will the version of D it runs with be frozen?

cheers
Nick B
Nov 05 2008
"Walter Bright" wroteJarrett Billingsley wrote:dereference errors. In D, it simply doesn't happen, because the array has a guard that is stored with the reference -- the length. I think these similar to the kinds of things that Jarrett is referring to. Something that's like a pointer, but can't ever be null, so you never have to check it for null before using it. Except Jarrett's idea eliminates it at compile time vs. run time. Couldn't one design a struct wrapper that implements this behavior? Something like: NonNullable!(T) { opAssign(T t) {/* make sure t is not null */} opAssign(NonNullable!(T) t) {/* no null check */} ... } -SteveDon't you think that eliminating something that's always a bug at compile time is a worthwhile investment?Not always. There's a commensurate increase in complexity that may not make it worth while. My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.
Nov 05 2008
Steven Schveighoffer wrote:

> Couldn't one design a struct wrapper that implements this behavior?

If that cannot be done in D, then D needs some design improvements. Essentially, any type should be "wrappable" in a struct which can alter the behavior of the wrapped type. For example, you should also be able to create a ranged int that can only contain values from n to m:

RangedInt!(N, M) i;

Preserving this property of structs has driven many design choices in D, particularly with regards to how const fits into the type system.
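A minimal C++ sketch of the RangedInt idea (template parameters and names are illustrative): a struct wrapping an int whose every assignment is range-checked, while reads pass through unchecked.

```cpp
#include <stdexcept>

// An int restricted to [N, M]: the invariant is enforced at every
// write, so code holding a RangedInt can rely on it without checking.
template <int N, int M>
class RangedInt {
    int v_;
public:
    RangedInt(int v) : v_(check(v)) {}
    RangedInt& operator=(int v) { v_ = check(v); return *this; }
    operator int() const { return v_; }   // reads need no check
private:
    static int check(int v) {
        if (v < N || v > M)
            throw std::out_of_range("value outside [N, M]");
        return v;
    }
};
```

In release builds the check could be compiled out, which is exactly the "wrappable type" property being asked for: the wrapper adds behavior without changing the representation (a single int).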
Nov 05 2008
Walter Bright:

> For example, you should also be able to create a ranged int that can only contain values from n to m: RangedInt!(N, M) i;

I presume you meant something like this:

Ranged!(int, N, M) i;

That has to work (assert, and when possible statically assert) in situations like the following ones too:

i = long.max; // static err
i = ulong.max; // static err
Ranged!(ulong, 0, ulong.max) ul, r1, r2;
ul = ulong.max + 10; // static err
r1 = ulong.max / 2 + 100;
r2 = ulong.max / 2 + 100;
r1 + r2 // runtime err

As you have discussed recently, for a compiler designer, choosing what goes in the language and what to keep out of it and in libraries is very important. Some time ago I read about a special Scheme *compiler* that is very small and very "pluggable", so a large part of the language is implemented by external libs, even a *static* type system, and several other things that are usually assumed to be part of the compiler itself. It's not just a matter of creating a very flexible compiler core: even if you somehow are able to create it, there are other problems to be solved. Designing languages is quite hard, so you can't expect an average programmer to be a good designer like that (that's also why AST macros may lead to some troubles). So if you push things out of the language, you probably have to put them into the standard libs, so normal programmers can use a standard and well-designed version of them. Otherwise it leads to a lot of troubles that I won't list now.

How can we establish whether ranged integral values have to be outside the compiler or inside? We can list requirements, etc. Generally, the more things are pushed into the compiler, the more complex it becomes, the slower it is to improve and maintain, and such features can also become more rigid (this can be seen very well with D unittests and ddoc. I think that eventually D unittests and ddoc may have to be removed from the language and put into the standard library, and the language itself may have to grow some features (some more reflection? Maybe like Flectioned?) that allow the standard library code to define them with a handy & short syntax anyway. This would both reduce compiler complexity, let those functionalities evolve more, and allow the community of D programmers to improve them).

Some features of ranged integrals:

- Making integrals ranged has several purposes: the main one is to avoid a class of runtime bugs, another is to shorten some code a little. The final purpose is to have release code that has zero speed penalty compared to the D code of today. Some of those bugs are caught by runtime code, and others can probably be avoided at compile time, by the type system. The compiler can also avoid inserting runtime controls where it infers that values stay within certain ranges. The code inside contracts (I mean contract programming) can also be used by the compiler to infer where it can remove more of those runtime controls.

- A short handy syntax is important, because for such ranges to become part of the D culture they have to be handy, short, etc. If D has some features that most D programmers don't use, then they become less useful.

- Probably, to avoid integral-related bugs, the compiler and the runtime have to control all integral values used by the program, because letting the programmer use a few of them in special points is probably a way to not see them used much. For the same purpose, such controls probably need to be on (activated) by default, like array range controls.

- Recently I have shown a possible syntax to disable/enable some controls locally, like:

safe(stack, ranges, arrays) {...}
unsafe(ranges) {...}

I think such a syntax is better than the syntax used by ObjectPascal for such purposes.

- Once and where disabled, such controls must cost zero at runtime, because D is designed to be quick. I presume the SafeD language has to keep them always activated (that's why having the compiler remove some of them automatically, or using the contracts, is useful).

- From the links I have shown here recently you can learn how common such a class of integral-related bugs is. And generally you can't talk about a "Safe-D" if you can't sum two integral values reliably :-)

- Range types of integers/chars/enums are useful, but subranges are also useful; they are essentially subtypes specialized for just this purpose. So if:

typedef int:1..6 TyDice;
typedef TyDice:1..3 TyHalfDice;

then a function that takes a TyDice automatically accepts a TyHalfDice too. Note that the Haskell type system is so powerful that it allows the programmer to define such semantics, subtypes, etc. But the D type system is quite more "primitive", so some of such things may need to be hardwired instead of being user-defined (by programmers that know a lot of type theory, of course).

- So I think that while unittests and ddoc may be better out of the compiler, range types may be better in it (note that I use unittests and ddoc _all the time_; all my programs use them heavily, I like them. I am not saying this because I don't like unittests and documentation strings).

Bye,
bearophile
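The subrange rule sketched above — a function taking a TyDice [1,6] automatically accepts a TyHalfDice [1,3], with no runtime check — can be approximated in C++ with a converting constructor that only compiles when the source range is nested inside the target range (names are illustrative):

```cpp
// A value known to lie in [N, M]. Widening conversions (to an enclosing
// range) are implicit and free; anything else fails at compile time.
template <int N, int M>
struct Sub {
    int v;
    explicit Sub(int x) : v(x) {}            // runtime check elided in this sketch
    template <int N2, int M2>
    Sub(Sub<N2, M2> other) : v(other.v) {    // widening: no runtime check
        static_assert(N <= N2 && M2 <= M,
                      "only sub-ranges convert implicitly");
    }
};

// takes any dice value in [1, 6]
int roll_total(Sub<1, 6> dice) { return dice.v; }
```

Calling `roll_total` with a `Sub<1, 3>` works through the implicit widening constructor; a `Sub<0, 9>` argument would trip the static_assert instead of risking a runtime failure.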
Nov 05 2008
Walter Bright wrote:

> If that cannot be done in D, then D needs some design improvements. Essentially, any type should be "wrappable" in a struct which can alter the behavior of the wrapped type.

Does that mean we're getting implicit cast overloads? Because without RangedInt!(N, M).opImplicitCastFrom(int i) you can't pass int values to functions accepting RangedInt instances. You can't pass a different RangedInt!(X, Y) either. It defeats the purpose of implicit range checking if you have to write litanies like foo(RangedInt!(1,10).check(i)) just to call a function.

Sorry to jump the topic like that, but last time I asked my thread got hijacked. :P
Nov 06 2008
Hxal wrote:

> Does that mean we're getting implicit cast overloads? Because without RangedInt!(N, M).opImplicitCastFrom(int i)

opImplicitCast, yes.
Nov 06 2008
Reply to Walter,

> If that cannot be done in D, then D needs some design improvements. Essentially, any type should be "wrappable" in a struct which can alter the behavior of the wrapped type. For example, you should also be able to create a ranged int that can only contain values from n to m: RangedInt!(N, M) i;

Why not explicitly support this with bodied typedefs?

typedef int MyInt(int m, int M)
{
    MyInt opAssign(int i)
    {
        assert(m <= i && i <= M);
        this = i;
    }

    MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
    {
        static if(m > m_ && M_ > M)
            assert(m <= i && i <= M); // only assert as needed
        else static if(m > m_)
            assert(m <= i);
        else static if(M < M_)
            assert(i <= M);
        this = i;
    }

    // this is only to define the return type; the normal code for int is still generated
    MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this;
}
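The opAdd line above is interval arithmetic: adding an [m1,M1] value to an [m2,M2] value yields an [m1+m2,M1+M2] value, with the new bounds computed at compile time so no runtime check is needed on the result. A C++ sketch of just that piece (names are illustrative):

```cpp
// A value carrying its compile-time bounds.
template <int Lo, int Hi>
struct Bounded {
    int v;
};

// Adding two bounded values produces a value whose bounds are the sums
// of the operands' bounds -- the result type widens, so the addition
// itself can never violate the invariant and needs no runtime check.
template <int L1, int H1, int L2, int H2>
Bounded<L1 + L2, H1 + H2> operator+(Bounded<L1, H1> a, Bounded<L2, H2> b) {
    return {a.v + b.v};
}
```

For example, adding two dice values typed `Bounded<1,6>` yields a `Bounded<2,12>`, exactly as one would expect for a pair of dice.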
Nov 06 2008
BCS wrote:

> Why not explicitly support this with bodied typedefs?

Just create a struct with an int as its only member?
Nov 07 2008
Reply to KennyTM~,

> Just create a struct with an int as its only member?

But if you do that then you have to explicitly build all of the math overloads. Sometimes they ALL end up as simple shells, so why force the programmer to build them all? Also it forces the compiler to go through the struct's methods, potentially not inlining, rather than using the int code generator directly.
Nov 07 2008
On Tue, 04 Nov 2008 12:32:59 -0800, Walter Bright <newshound1 digitalmars.com> wrote:

> Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.

null pointers DO cause memory corruption:

byte* foo = null; // NULL!
foo[1244916] = 5; // WORKS; CORRUPTS!
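The reason this corrupts memory on machines without protection can be made concrete: indexing a null pointer just produces a small absolute address, and only an MMU stands between that address and a stray write. A C++ sketch that computes the address arithmetically (casting to an integer first, to avoid actually performing null-pointer arithmetic):

```cpp
#include <cstddef>
#include <cstdint>

// Where does foo[offset] land when foo is null? At absolute address
// `offset` -- a perfectly writable location on a machine without
// memory protection, such as classic Mac OS.
std::uintptr_t null_index_address(std::size_t offset) {
    char* base = nullptr;   // numerically address 0 on typical platforms
    return reinterpret_cast<std::uintptr_t>(base) + offset;
}
```

So `foo[1244916] = 5` is a write to address 1244916: on a protected OS the process is killed, but on an unprotected one whatever lives there is silently overwritten.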
Nov 04 2008
cemiller wrote:

> null pointers DO cause memory corruption:
>
> byte* foo = null; // NULL!
> foo[1244916] = 5; // WORKS; CORRUPTS!

Yes, but so will any pointer that you index out of bounds. That's why safe D will not allow arithmetic on pointers.
Nov 04 2008
Walter Bright Wrote:

> Yes, but those are neither type safe errors nor memory safe errors. A null pointer is neither mistyped nor can it cause memory corruption.

Well... I can't speak for null pointers in D, but they can definitely cause memory corruption in C++. Not all OS's have memory protection. *remembers the good old days of Mac OS system 7*

Back to the important point! A couple of times in this thread I've seen people suggest that null pointers are type safe. I don't see how that statement is justifiable. People accept null because it's always been there for those of us who are long-time C coders. What you have to remember is that C was not type safe in any way, shape, or form.

First off, let's clarify that we're talking about *static* type safety. Languages like Python are dynamically type safe because at runtime you will see an exception thrown if you try to perform an operation on a type that does not support it. If you have a reference in Python, you can point it to whatever the hell you want and the runtime will prevent you from performing the wrong operation on the wrong data. It's a more limited form of type checking than static type checking, but many people find this acceptable. In a statically typed language, it is *impossible* to perform an operation on a type that does not support it, because at compile time you know the types of the objects.

Concretely, null is a pointer to address zero. For some type T, there is never any T at address zero. Therefore a statically typed language will prevent you from assigning a pointer to an object that is not of type T to a pointer declared to be of type T. That's *the entire point* of static typing. T* means "that which I point to is in the set of T". T sans the star means "I am in the set of T". Not sometimes. Not maybe. Always. Yes, you can also get performance benefits from type annotations... but that doesn't make the language statically type *safe*.

Now of course, sometimes we do want a pointer to type T to be null... but what does that *mean*? It means you have a variable that sometimes you want to hold a pointer to T... and sometimes you don't. This is called a variant. Different languages implement variants in different ways and have different names for them. In C, they are called unions. C, again, is not type *safe*, so if you try to treat a union as the wrong type, it will let you. However, most languages provide dynamic typing for variants, and thus offer the lesser form of type safety. C and C++ pointers to T are variants of type T and the type of NULL. Except, of course, like unions they aren't type safe even dynamically, because the runtime won't stop you from dereferencing null. The operating system *will* stop you by killing your process, if you are on a system with protected memory, because address zero is not accessible to userspace on most systems. *Most* systems, not all.

Think about this in terms of set theory and the idea should become clear. Null should not be assignable to a pointer to T because the object it points to at address zero does not lie within the set of T's. If it did lie within the set of T's, then this should be valid:

T myObject;
myObject = *NULL;

It shouldn't even require a type cast, because type casts are ways of breaking out of static typing. But it does in C++. In fact, this code generates:

error: invalid type argument of `unary *'

Damn right.

Now, really, what's so hard about adding a statically type safe pointer? C++ already did it, and they are called references. My complaint here, after all, was that D is apparently less type safe than C++. Now, I have other problems with C++ references. That they have value semantics is just stupid (especially since they are *called* references!). Type safety and value vs. reference semantics have nothing to do with one another. Indeed, C# added nullable value types.

Brendan
Nov 04 2008
Brendan Miller wrote:Well.. I can't speak for null pointers in D, but they can definitely cause memory corruption in C++. Not all OS's have memory protection. *remembers the good old days of Mac OS system 7*Those machines are obsolete, for excellent reasons <g>. If, for some reason, a D implementation needs to be implemented for such a machine, the solution is to optionally insert a runtime check analogously to array bounds checking.Concretely, null is a pointer to address zero. For some type T, there is never any T at address zero. Therefore a statically typed language will prevent you from assigning a pointer to an object that is not of type T to a pointer declared to be type T. That's *the entire point* of static typing. T* means "that which I point to is in the set of T". T sans the star means "I am in the set of T". Not sometimes. Not maybe. Always.I understand your point, and it sounds right technically. But practically, I'm not convinced. For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
Nov 04 2008
On Tue, Nov 4, 2008 at 11:40 PM, Walter Bright <newshound1 digitalmars.com> wrote:I understand your point, and it sounds right technically. But practically, I'm not convinced. For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?The implication of non-nullable types isn't that nullable types disappear; quite the opposite, in fact. Nullable types have obvious use for exactly the reason you explain. The problem arises when nullable types are used in situations where it makes _no sense_ for null to appear. This is where bugs show up. In a system that has both nullable and non-null types, nullable types act as a sort of container, preventing you from accessing anything through them as it cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type. Delight uses D's "declare a variable in the condition of an if or while" to great effect here: if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null { // f is known not to be null. } else { // something else happened. Handle it. } Null still has a purpose. It's just that its purpose is really only to signal a special case.
Nov 04 2008
Jarrett Billingsley wrote:The implication of non-nullable types isn't that nullable types disappear; quite the opposite, in fact. Nullable types have obvious use for exactly the reason you explain. The problem arises when nullable types are used in situations where it makes _no sense_ for null to appear. This is where bugs show up. In a system that has both nullable and non-null types, nullable types act as a sort of container, preventing you from accessing anything through them as it cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type. Delight uses D's "declare a variable in the condition of an if or while" to great effect here: if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null { // f is known not to be null. } else { // something else happened. Handle it. }I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
Nov 05 2008
On 2008-11-05 03:18:50 -0500, Walter Bright <newshound1 digitalmars.com> said:I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.I'm not sure how you're reading things, but to me having two kinds of pointers (nullable and non-nullable) is exactly what you need to enable nullness flow analysis across function boundaries. Basically, if you declare some pointer to be restricted to not-null in a function signature, and then try to call the function by passing it a possibly null pointer, the compiler can tell you that you need to check for null at the call site before calling the function. It then ensues that when given a non-nullable pointer you can call a function requiring a non-nullable pointer directly without any check for null, because you know the pointer you received can't be null. Currently, you can achieve this with proper documentation of functions saying whether arguments accept null and if return values can return null, and write your code with those assumptions in mind. More often than not, however, there is no such documentation and you find yourself checking for null a lot more than necessary. If this property about pointers in function parameters and return values were known to the compiler, the compiler could check for you that you're doing things correctly, warn you whenever you're forgetting a null check, and optimise away checks for null on these pointers. I know the null-dereferencing problem can generally be caught easily at runtime, but sometimes your null comes from far away in the program (someone set a global to null for instance) and you're left to wonder who put a null value there in the first place. 
Non-nullable pointers would help a lot in those cases because you no longer have to test every code path and the error of giving a null value would be caught at the source (with the compiler telling you to check against null), not only where it's being dereferenced. - - - That said, I think this could be done using a template. Consider this: struct NotNullPtr(Type) { private Type* ptr; this(Type* ptr) { opAssign(ptr); } void opAssign(Type* ptr) { // if this gets inlined and you have already checked for null, then // hopefully the optimizer will remove this redundant check. if (ptr) this.ptr = ptr; else throw new Exception("Unacceptable null value."); } void opAssign(NotNullPtr other) { this.ptr = other.ptr; } Type* opCast() { return ptr; } ref Type opDeref() { return *ptr; } alias opDeref opStar; // ... implement the rest yourself } (not tested) You could use this template everywhere you want to be sure a pointer isn't null. It guarantees that its value will never be null, and will throw an exception at the source where you attempt to put a null value in it, not when you attempt to dereference it later, when it's too late and your program has already been put in an incorrect state. NotNullPtr!(int) globalThatShouldNotBeNull; int* foo(); globalThatShouldNotBeNull = foo(); // will throw if you attempt to set it to null. void bar(NotNullPtr!(int) arg); bar(globalThatShouldNotBeNull); // no need to check for null. The greatest downside to this template is that since it isn't part of the language, almost no one will use it in their function prototypes and return types. That's not counting that its syntax is verbose and not very appealing (although it's not much worse than boost::shared_ptr or std::auto_ptr). But still, if you have a global or member variable that must not be null, it can be of use; and if you have a function where you want to put the burden of checking for null on the caller, it can be of use. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 05 2008
Michel Fortin:Basically, if you declare some pointer to be restricted to not-null in a function signature, and then try to call the function by passing it a possibly null pointer, the compiler can tell you that you need to check for null at the call site before calling the function. It then ensues that when given a non-nullable pointer you can call a function requiring a non-nullable pointer directly without any check for null, because you know the pointer you received can't be null. Currently, you can achieve this with proper documentation of functions saying whether arguments accept null and if return values can return null, and write your code with those assumptions in mind. More often than not, however, there is no such documentation and you find yourself checking for null a lot more than necessary. If this property about pointers in function parameters and return values were known to the compiler, the compiler could check for you that you're doing things correctly, warn you whenever you're forgetting a null check, and optimise away checks for null on these pointers.The same is true of making integral values into range values. If I want to write a function that takes an iterable of results of throwing a dice, I can use an enum, or check every item of the iterable for being in the range 1-6. If range values are available I can just: StatsResults stats(Dice[] throwing_results) { ... Where Dice is: typedef int:1..7 Dice; I then don't need to remember to check that items are in 1-6 inside stats(), and the check is pushed up, toward the place where that throwing_results was created (or where it comes from disk, user input, etc). This avoids some bugs and reduces some code. Bye, bearophile
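As an illustrative aside (not from the thread): bearophile's `typedef int:1..7 Dice` idea can be approximated in C++ with a small checked wrapper, so the range test happens once, where the value is created. All names here (`Ranged`, `Dice`, `sum`) are invented for the sketch:

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Sketch of a range-restricted integral type: the check runs at
// construction, so downstream code can assume the invariant holds.
// The range is half-open, matching "int:1..7" meaning 1 through 6.
template <int Lo, int Hi>
class Ranged {
    int value;
public:
    Ranged(int v) : value(v) {
        if (v < Lo || v >= Hi)
            throw std::out_of_range("value outside declared range");
    }
    operator int() const { return value; }
};

using Dice = Ranged<1, 7>; // stand-in for "typedef int:1..7 Dice"

// A stats-like function can now trust every element without re-checking.
int sum(const std::vector<Dice>& rolls) {
    int s = 0;
    for (Dice d : rolls) s += d;
    return s;
}
```

Constructing `Dice(7)` throws at the source of the bad value, which is the point being made: the check is pushed toward where the data enters the program, not where it is consumed.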
Nov 05 2008
On 2008-11-05 08:16:59 -0500, bearophile <bearophileHUGS lycos.com> said:The same is true of making integral values into range values. If I want to write a function that takes an iterable of results of throwing a dice, I can use an enum, or check every item of the iterable for being in the range 1-6. If range values are available I can just: StatsResults stats(Dice[] throwing_results) { ... Where Dice is: typedef int:1..7 Dice; I then don't need to remember to check that items are in 1-6 inside stats(), and the check is pushed up, toward the place where that throwing_results was created (or where it comes from disk, user input, etc). This avoids some bugs and reduces some code.It's exactly the same thing, except that for numbers you may want much more than simple ranges. You could want non-zero numbers, odd or even numbers, square numbers, etc. I have the feeling that whatever the language tries to restrict about numbers, it will never be enough. So my feeling is that this is better left to a template. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Nov 06 2008
On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright <newshound1 digitalmars.com> wrote:Jarrett Billingsley wrote:What? Is your response in response to my post at all? I am not talking about flow analysis on "normal" pointer types. I am talking about the typing system actually being modified to allow a programmer to express the idea, with a _type_, and not with static checking, that a reference/pointer value _may not be null_. In a type system with non-null types, if a function takes a non-null parameter and you pass it a nullable pointer, _you get an error at compile time_. // foo takes a non-null int*. void foo(int* x) { writefln("%s", *x); } // bar returns a nullable int* - an int*?. int*? bar(int x) { if(x < 10) return new int(x); else return null; } foo(bar(3)); // compiler error, you can't pass a potentially null type into a parameter that can't be null, moron if(auto p = bar(3)) foo(p); // ok else throw new Exception("Wah wah wah bar returned null"); With nullable types, flow analysis doesn't have to be done. It is implicit in the types. It is mangled into function names. foo _cannot_ take a pointer that may be null. End of story.The implication of non-nullable types isn't that nullable types disappear; quite the opposite, in fact. Nullable types have obvious use for exactly the reason you explain. The problem arises when nullable types are used in situations where it makes _no sense_ for null to appear. This is where bugs show up. In a system that has both nullable and non-null types, nullable types act as a sort of container, preventing you from accessing anything through them as it cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type. Delight uses D's "declare a variable in the condition of an if or while" to great effect here: if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null { // f is known not to be null. 
} else { // something else happened. Handle it. }I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
Nov 05 2008
Jarrett Billingsley wrote:With nullable types, flow analysis doesn't have to be done. It is implicit in the types. It is mangled into function names. foo _cannot_ take a pointer that may be null. End of story.Sure, which is why I was puzzled at the example given, which is about something else entirely. What you're talking about is a type constructor to create another kind of pointer. It's a significant increase in complexity. That's why I was talking about complexity being a downside of this - there is a tradeoff.
Nov 05 2008
On Wed, Nov 5, 2008 at 10:43 PM, Jarrett Billingsley <jarrett.billingsley gmail.com> wrote:On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright <newshound1 digitalmars.com> wrote:I didn't really get what you meant the first time either. The thing about Delight's use of auto "to great effect" wasn't clear. I assumed it was basically the same as D's auto inside an if, but I see now that it's not. Looks like a run-time type deduction, even though it's not really. Kinda neat. --bbJarrett Billingsley wrote:What? Is your response in response to my post at all? I am not talking about flow analysis on "normal" pointer types. I am talking about the typing system actually being modified to allow a programmer to express the idea, with a _type_, and not with static checking, that a reference/pointer value _may not be null_. In a type system with non-null types, if a function takes a non-null parameter and you pass it a nullable pointer, _you get an error at compile time_. // foo takes a non-null int*. void foo(int* x) { writefln("%s", *x); } // bar returns a nullable int* - an int*?. int*? bar(int x) { if(x < 10) return new int(x); else return null; } foo(bar(3)); // compiler error, you can't pass a potentially null type into a parameter that can't be null, moron if(auto p = bar(3)) foo(p); // ok else throw new Exception("Wah wah wah bar returned null"); With nullable types, flow analysis doesn't have to be done. It is implicit in the types. It is mangled into function names. foo _cannot_ take a pointer that may be null. End of story.cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type. Delight uses D's "declare a variable in the condition of an if or while" to great effect here: if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null { // f is known not to be null. } else { // something else happened. Handle it. }I don't see what you've gained here. 
The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
Nov 05 2008
On Wed, Nov 5, 2008 at 9:33 AM, Bill Baxter <wbaxter gmail.com> wrote:I didn't really get what you meant the first time either. The thing about Delight's use of auto "to great effect" wasn't clear. I assumed it was basically the same as D's auto inside an if, but I see now that it's not. Looks like a run-time type deduction, even though its not really. Kinda neat.It's almost the same as D's variable-inside-an-if, with the addition that you can use it to convert a nullable type to a non-null type. Hence, in: if(auto f = someFunctionThatCanReturnNull()) if someFunctionThatCanReturnNull returns an int*? (nullable pointer to int), typeof(f) will just be int* (non-null pointer to int), since in the scope of the if statement, f is provably non-null.
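For comparison outside D (a sketch, not from the thread; `bar` and `twice_or_zero` are invented names): the narrowing pattern Delight applies is close to what `std::optional` gives C++, where declaring the variable in the if-condition peels the nullable wrapper off:

```cpp
#include <optional>

// std::optional<int> plays the role of the nullable type int*?.
std::optional<int> bar(int x) {
    if (x < 10) return x;  // the "has a value" case
    return std::nullopt;   // the "null" case
}

int twice_or_zero(int x) {
    // p converts to true only when a value is present; inside the branch,
    // *p is an ordinary int and there is no null left to worry about.
    if (auto p = bar(x)) {
        return *p * 2;
    }
    return 0; // the "null" case is handled explicitly
}
```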
Nov 05 2008
Walter Bright Wrote:Brendan Miller wrote:You mean D isn't meant to run on embedded hardware? I thought it was a systems programming language? A lot of hardware today has no MMU, which as I understand it means you can't have memory protection. If you are only targeting x86 after the ability to have memory protection was added to the hardware... then you can make all kinds of assumptions I guess.Well.. I can't speak for null pointers in D, but they can definitely cause memory corruption in C++. Not all OS's have memory protection. *remembers the good old days of Mac OS system 7*Those machines are obsolete, for excellent reasons <g>. If, for some reason, a D implementation needs to be implemented for such a machine, the solution is to optionally insert a runtime check analogously to array bounds checking.The short answer is you use a variant. It handles the case where you would use null slightly better (because it is dynamically type safe). For a C style language, just having two kinds of pointers might be more natural. Like maybe: safe T* object1; // does not permit null. unsafe T* object2; // does permit null. and then have a cast between them: object1 = (safe T*)object2; // This throws some kind of well defined exception if object2 is null. The long answer is that the best way to learn about type safety is to check out SML or OCaml. These are a couple of the few truly statically type safe languages. Languages like those introduced type safety, and ideas like generics and type inference. ML is to type safety as Smalltalk is to object orientation. You will probably never write a real world program in ML, but learning it is probably the best way to get a good understanding of where things like type safety and templates came from in the first place. Also the linked list example you give is actually *way easier* in a language like ML that supports variants. 
The reason for this is that variants can be used in conjunction with a pattern matching syntax, which is kind of like a switch statement on steroids. As a side note, I find it interesting that C++ templates are actually much more powerful than the ML style generics they emulate. Specifically, templates can do template metaprogramming, whereas I'm pretty sure this is not possible in any ML-style language.Concretely, null is a pointer to address zero. For some type T, there is never any T at address zero. Therefore a statically typed language will prevent you from assigning a pointer to an object that is not of type T to a pointer declared to be type T. That's *the entire point* of static typing. T* means "that which I point to is in the set of T". T sans the star means "I am in the set of T". Not sometimes. Not maybe. Always.I understand your point, and it sounds right technically. But practically, I'm not convinced. For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
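To make the ML reference concrete (a hedged sketch, not ML itself; `Nil`, `Cons`, and `sum` are invented names): the variant-list-with-pattern-matching idea can be imitated in C++ with `std::variant` and `std::visit` standing in for the match syntax:

```cpp
#include <memory>
#include <type_traits>
#include <variant>

// Roughly ML's: type list = Nil | Cons of int * list
struct Nil {};
struct Cons;
using List = std::variant<Nil, std::shared_ptr<Cons>>;
struct Cons { int head; List tail; };

// std::visit stands in for pattern matching: both cases must be handled,
// and there is no null pointer to forget about at the end of the list.
int sum(const List& l) {
    return std::visit([](const auto& node) -> int {
        using T = std::decay_t<decltype(node)>;
        if constexpr (std::is_same_v<T, Nil>)
            return 0;                             // | Nil -> 0
        else
            return node->head + sum(node->tail);  // | Cons(h, t) -> h + sum t
    }, l);
}
```

The end of the list is a distinct case of the type rather than a special pointer value, which is what the post means by variants making the linked list example easier.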
Nov 05 2008
The basic problem is that it's hard to integrate a language with non-nullable types with libraries written without them (e.g. when auto-generating bindings with BCD). This would likely be a big problem for D, since integrating with existing C and C++ code is a major feature.Can't that be solved by reversing the syntax, i.e., you mark variables that cannot be null rather than variables that can? The compiler then requires you to prove the non-nullness of the value on that code path (or cast it away with nonnull). I'd love to see non-null types in D2. Intuitively, it'd catch at compile time quite a few bugs I see in my programs (assuming a strong compiler analysis).
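The reversed marking suggested above could be sketched as a wrapper whose only way in is a checked construction (all names invented; one possible shape, not a definitive design):

```cpp
#include <stdexcept>

// NonNull<T> is the annotation: a raw pointer must pass through check()
// before it can flow into code that requires non-null.
template <typename T>
class NonNull {
    T* p;
    explicit NonNull(T* q) : p(q) {} // private: no unchecked construction
public:
    // The proof step: throws if the pointer is null, wraps it otherwise.
    static NonNull check(T* q) {
        if (!q) throw std::invalid_argument("null where non-null required");
        return NonNull(q);
    }
    T& operator*() const { return *p; }
    T* operator->() const { return p; }
};
```

A function taking `NonNull<int>` never needs an internal null test; the burden of proof sits at the call site, which is the reversal being proposed.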
Nov 28 2008