
digitalmars.D - null and type safety

reply Brendan Miller <catphive catphive.net> writes:
So I'm curious about D as a programming language, especially as it compares to
C++.

One problem that C++ made a partial effort to solve was that normal pointers in
C and C++ essentially aren't type safe.

Consider that null can always be assigned to a pointer or reference to type T
in those languages, and null is clearly *not* of type T, thus operations on a
variable denoted of type T are doomed to fail.

T *myObject = null;
myObject->myMethod(); // fails, despite the fact that myObject is of type T
                      // and myMethod is defined for type T.

Null is a holdover from C and has no place in a typesafe language. The
designers of C++ knew this, and so introduced the c++ reference type:

T &myObjectRef = ...;

which cannot be null.

T &myObjectRef = null; // fails at compile time
T &myObjectRef = *ptr; // if ptr is null, operation is "undefined".


Java and C# kept null references for marketing purposes and never really
thought through these issues (although I read an interview where Anders
Hejlsberg admitted this was a mistake that occurred to him too late to fix).

This is obviously a problem. Everyone knows that null pointer exceptions are
one of the biggest sources of runtime errors. Furthermore, there's no reason
whatsoever that these problems
can't be caught by the compiler in a strongly typed language like C++ that has
the idea of a non-nullable pointer. The whole point of type annotations is to
catch these errors before runtime, after all. Otherwise it's just a lot of
useless typing. C++ solves the problem partially with references, and truly
statically type-safe languages like ML solve this problem by making variables
typesafe by default, and using an Optional type to wrap nullable types.
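
For illustration, here is a minimal sketch of the ML approach transplanted into D (Optional here is a hypothetical struct, not an existing library type):

struct Optional(T)
{
    private T value;
    private bool present = false;

    static Optional some(T v) { Optional o; o.value = v; o.present = true; return o; }
    static Optional none()    { Optional o; return o; }

    bool hasValue() { return present; }
    T get() { assert(present, "empty Optional"); return value; }
}

// absence is spelled out in the type instead of smuggled in as null:
// Optional!(int) x = Optional!(int).none();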

So my question is: as the successor to C++, how does D solve this problem? I'm
looking through the D docs, which are a little sparse, but I'm not seeing any
references to pointers that can't be nulled.

Brendan
Nov 03 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Brendan Miller:
Null is a holdover from C and has no place in a typesafe language.<
In a (system) language I want to be able to create tree data structures that contain a cargo plus true pointers, and those pointers to structs can be null in leaves. Having a safe language is good, but I want 1 language that gives me sharp tools too.
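
For instance, a minimal sketch of what I mean (names are illustrative):

struct TreeNode
{
    int cargo;       // the payload
    TreeNode* left;  // null in leaves
    TreeNode* right; // null in leaves
}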
I'm looking through the D docs, that are a little sparse, but I'm not seeing
any references to pointers that can't be nulled.<
You may like the Delight language (it compiles to D2):
http://delight.sourceforge.net/null.html

Bye,
bearophile
Nov 03 2008
parent reply Thomas Leonard <talex5+d gmail.com> writes:
On Mon, 03 Nov 2008 17:04:48 -0500, bearophile wrote:

 Brendan Miller:
Null is a holdover from C and has no place in a typesafe language.<
In a (system) language I want to be able to create tree data structures that contain a cargo plus true pointers, and those pointers to structs can be null in leaves. Having a safe language is good, but I want 1 language that gives me sharp tools too.
I'm looking through the D docs, that are a little sparse, but I'm not
seeing any references to pointers that can't be nulled.<
You may like the Delight language (it compiles to D2): http://delight.sourceforge.net/null.html
Note that the maybe types in Delight are independent of the syntax changes (apart from the ? type annotation) and you could easily enable them when compiling D code too (basically, just remove the code that disables this feature when it detects it's compiling D syntax source).

The basic problem is that it's hard to integrate a language with non-nullable types with libraries written without them (e.g. when auto-generating bindings with BCD). This would likely be a big problem for D, since integrating with existing C and C++ code is a major feature.

Even if you add the annotations manually to an existing library, you often get a poor API. e.g. this GLib function for copying a string:

char*? g_strdup(const(char)*? s)

If you pass NULL in, you get NULL out. That's a useful convenience in C, but in Delight it forces you to check whether the result is null before you can use it. If the API had been designed for Delight in the first place, it wouldn't take or return a nullable type.
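
For instance, a caller of the annotated binding above ends up writing something like this (a rough sketch in Delight-style notation; the if-binding promotes the nullable value to non-null inside the branch):

char*? copy = g_strdup(s); // s is some const(char)*? from elsewhere
if (auto c = copy)
{
    // c is non-null here; safe to use
}
else
{
    // copy is null (s was null); handle it
}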
Nov 04 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Thomas Leonard wrote:
 On Mon, 03 Nov 2008 17:04:48 -0500, bearophile wrote:
 
 Brendan Miller:
 Null is a holdover from C and has no place in a typesafe language.<
In a (system) language I want to be able to create tree data structures that contain a cargo plus true pointers, and those pointers to structs can be null in leaves. Having a safe language is good, but I want 1 language that gives me sharp tools too.
 I'm looking through the D docs, that are a little sparse, but I'm not
 seeing any references to pointers that can't be nulled.<
You may like the Delight language (it compiles to D2): http://delight.sourceforge.net/null.html
 Note that the maybe types in Delight are independent of the syntax
 changes (apart from the ? type annotation) and you could easily enable
 them when compiling D code too (basically, just remove the code that
 disables this feature when it detects it's compiling D syntax source).

 The basic problem is that it's hard to integrate a language with
 non-nullable types with libraries written without them (e.g. when
 auto-generating bindings with BCD). This would likely be a big problem
 for D, since integrating with existing C and C++ code is a major feature.
It could be done if non-nullable pointers/references were allowed in addition to nullable ones.

A note however - although I agree it's nice to have non-nullable types, the argument is a bit overstated, as type safety has little to do with it. Null is easy and cheap to check for deterministically.

Andrei
Nov 04 2008
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Null is easy and cheap to check for deterministically.
I think some languages add something to help with that because:
- Sometimes you may forget to add those checks manually when you need to.
- If you don't need to put in those checks, the code ends up a little shorter and cleaner. Some kinds of programmers like this. C++ programmers probably care less about this.

Bye,
bearophile
Nov 04 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 04 Nov 2008 00:10:29 +0300, Brendan Miller <catphive catphive.net>  
wrote:

 So I'm curious about D as a programming language, especially as it  
 compares to C++.

 One problem that C++ made a partial effort to solve was that normal  
 pointers in C and C++ essentially aren't type safe.

 Consider that null can always be assigned to a pointer or reference to  
 type T in those languages, and null is clearly *not* of type T, thus  
 operations on a variable denoted of type T are doomed to fail.

 T *myObject = null;
 myObject->myMethod(); // fails, despite the fact that myObject is of type T
                       // and myMethod is defined for type T.

 Null is a holdover from C and has no place in a typesafe language. The  
 designers of C++ knew this, and so introduced the c++ reference type:

 T &myObjectRef = ...;

 which cannot be null.

 T &myObjectRef = null; // fails at compile time
 T &myObjectRef = *ptr; // if ptr is null, operation is "undefined".
Unfortunately, you can have null references in D:

Object o = null;

 Java and C# kept null references for marketing purposes and never really  
 thought through these issues (although I read an interview where Anders  
 Hejlsberg admitted this was a mistake that occurred to him too late to fix).

 This is obviously a problem. Everyone knows that null pointer exceptions  
 are one of the biggest sources of runtime errors. Furthermore, there's no reason whatsoever  
 that these problems can't be caught by the compiler in a strongly typed  
 language like C++ that has the idea of a non-nullable pointer. The whole  
 point of type annotations is to catch these errors before runtime, after  
 all. Otherwise it's just a lot of useless typing. C++ solves the problem  
 partially with references, and truly statically type-safe languages like  
 ML solve this problem by making variables typesafe by default, and using  
 an Optional type to wrap nullable types.

 So my question is: as the successor to C++, how does D solve this  
 problem? I'm looking through the D docs, which are a little sparse, but  
 I'm not seeing any references to pointers that can't be nulled.

 Brendan
Nullable types have been proposed several times by many people, but I don't recall any response from Walter or Andrei :(
Nov 04 2008
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer
 exceptions are one of the biggest sources of runtime errors.
Yes, but those are neither type safety errors nor memory safety errors. A null 
pointer is neither mistyped nor can it cause memory corruption.
Nov 04 2008
next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Tue, Nov 4, 2008 at 3:32 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer
 exceptions are one of the biggest sources of runtime errors.
Yes, but those are neither type safety errors nor memory safety errors. A null pointer is neither mistyped nor can it cause memory corruption.
Dereferencing a null pointer is *always* a bug; it doesn't matter how "safe" it is. Don't you think that eliminating something that's always a bug at compile time is a worthwhile investment?
Nov 04 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jarrett Billingsley wrote:
 Dereferencing a null pointer is *always* a bug, it doesn't matter how
 "safe" it is.
Sure. But I'm interested in creating a safe subset of D, and so the more correct interpretation of what constitutes "safety" is important.
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?
Not always. There's a commensurate increase in complexity that may not make it worth while. My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.
Nov 04 2008
next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?
Not always. There's a commensurate increase in complexity that may not make it worth while.
Have you looked at Delight at all? I wouldn't call the impact of nullable types on D "commensurate." It's probably far less than const, invariant, pure, and escape analysis.
 My focus is on eliminating bugs that cannot be reliably detected even at run
 time. This will be a big win for D.
Can you expand upon this a bit? What exactly are some bugs that can't be reliably detected at runtime other than memory corruption?
Nov 04 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jarrett Billingsley:
 Have you looked at Delight at all?
Besides the topic of nullable types you are discussing, Delight's look is designed to appeal mostly to Python programmers (despite being just D2, a little sugared), and/or to people that care a lot about having a clean(er) syntax, so C/C++ programmers may be less interested... If Delight becomes refined and debugged enough, I hope to see it bundled by default with the LDC compiler, as well as Tango, a GUI toolkit like GTK for D, and a few other goodies, like an editor/almost-IDE. I think it can become a way to "sell" D2 to other kinds of programmers.

Bye,
bearophile
Nov 04 2008
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Jarrett Billingsley wrote:
 On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?
Not always. There's a commensurate increase in complexity that may not make it worth while.
Have you looked at Delight at all? I wouldn't call the impact of nullable types on D "commensurate." It's probably far less than const, invariant, pure, and escape analysis.
Sorry, I have not looked at Delight.
 My focus is on eliminating bugs that cannot be reliably detected even at run
 time. This will be a big win for D.
Can you expand upon this a bit? What exactly are some bugs that can't be reliably detected at runtime other than memory corruption?
Memory corruption is a big one. Another is sequential consistency bugs, and then there's function hijacking.
Nov 04 2008
prev sibling next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Walter Bright wrote:
 Jarrett Billingsley wrote:
 Dereferencing a null pointer is *always* a bug, it doesn't matter how
 "safe" it is.
Sure. But I'm interested in creating a safe subset of D, and so the more correct interpretation of what constitutes "safety" is important.
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?
Not always. There's a commensurate increase in complexity that may not make it worth while. My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.
FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've often run into bugs that non-nullable types could have prevented (including one on a production system... well, there was another bug that raised an exception causing something else to be uninitialized and the system came crashing down).
Nov 04 2008
next sibling parent "Lionello Lunesu" <lionello lunesu.remove.com> writes:
"Robert Fraser" <fraserofthenight gmail.com> wrote in message 
news:geregs$sj1$1 digitalmars.com...
 FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've 
 often run into bugs that non-nullable types could have prevented 
 (including one on a production system... well, there was another bug that 
 raised an exception causing something else to be uninitialized and the 
 system came crashing down).
Hear hear! Nullness should have nothing to do with a type having reference or value semantics. These two concepts are orthogonal. L.
Nov 04 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 Walter Bright wrote:
 My focus is on eliminating bugs that cannot be reliably detected even 
 at run time. This will be a big win for D.
FWIW, I've _never_ run into a bug const could have prevented.
That isn't really the point of const. The point of const is to be able to write functions that can accommodate both mutable and invariant arguments. The point of invariantness is to be able to prove that code has certain properties. This is much better than relying on your programming team never making a mistake.

For example, you can do functional programming in C++. It's just that the compiler cannot know you're doing that, and so cannot take advantage of it. Furthermore, the compiler cannot detect when code is not functional, and so if someone hands you a million lines of code you have no freakin' way of determining if it adheres to functional principles or not. This really matters when one starts doing concurrent programming.
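
A quick sketch of that first point, in illustrative D2 code:

// one function body serves both mutable and invariant arguments
void show(const(char)[] s) { /* can read s, cannot modify it */ }

char[] m = "mutable".dup;          // mutable data
invariant(char)[] i = "invariant"; // invariant data
show(m); // ok
show(i); // ok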
Nov 05 2008
parent Nick B <nick.barbalich gmail.com> writes:
Can someone explain what the plan is for when Tango turns 1.0?

Will the code be frozen ?

Will the version of D it runs with be frozen ?


cheers
Nick B
Nov 05 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Walter Bright" wrote
 Jarrett Billingsley wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?
Not always. There's a commensurate increase in complexity that may not make it worth while. My focus is on eliminating bugs that cannot be reliably detected even at run time. This will be a big win for D.
dereference errors. In D, it simply doesn't happen, because the array has a 
guard that is stored with the reference -- the length.

I think these are similar to the kinds of things that Jarrett is referring to. 
Something that's like a pointer, but can't ever be null, so you never have to 
check it for null before using it. Except Jarrett's idea eliminates it at 
compile time vs. run time.

Couldn't one design a struct wrapper that implements this behavior? Something 
like:

struct NonNullable(T)
{
    private T t;
    void opAssign(T t) { assert(t !is null); /* make sure t is not null */ this.t = t; }
    void opAssign(NonNullable!(T) other) { /* no null check */ this.t = other.t; }
    ...
}

-Steve
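
For example, a hypothetical use of the wrapper above (C is an illustrative class name):

class C { void method() {} }

NonNullable!(C) nc;
nc = new C(); // ok
nc = null;    // trips the check at the assignment site,
              // not at some later dereference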
Nov 05 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 Couldn't one design a struct wrapper that implements this behavior? 
If that cannot be done in D, then D needs some design improvements. Essentially, any type should be "wrappable" in a struct which can alter the behavior of the wrapped type.

For example, you should also be able to create a ranged int that can only contain values from n to m:

RangedInt!(N, M) i;

Preserving this property of structs has driven many design choices in D, particularly with regards to how const fits into the type system.
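
A minimal sketch of such a wrapper, assuming a runtime check (a real version would also forward the arithmetic operators):

struct RangedInt(int N, int M)
{
    private int value = N;

    void opAssign(int v)
    {
        assert(v >= N && v <= M, "value out of range");
        value = v;
    }

    int get() { return value; }
}

// RangedInt!(1, 6) die;
// die = 4; // ok
// die = 7; // fails the assert at runtime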
Nov 05 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 For example, you should also be able to create a ranged int that can 
 only contain values from n to m:
 RangedInt!(N, M) i;
I presume you meant something like this:

Ranged!(int, N, M) i;

That has to work (assert, and when possible statically assert) in situations like the following ones too:

i = long.max;  // static err
i = ulong.max; // static err
Ranged!(ulong, 0, ulong.max) ul, r1, r2;
ul = ulong.max + 10; // static err
r1 = ulong.max / 2 + 100;
r2 = ulong.max / 2 + 100;
r1 + r2 // runtime err

As you have discussed recently, for a compiler designer choosing what goes in the language and what to keep out of it and in libs is very important. Some time ago I read about a special Scheme *compiler* that is very small and very "pluggable": a large part of the language is implemented by external libs, even a *static* type system, and several other things that are usually assumed to be part of the compiler itself. It's not just a matter of creating a very flexible compiler core: even if you somehow are able to create it, there are other problems to be solved. Designing languages is quite hard, so you can't expect an average programmer to be a good designer like that (that's also why AST macros may lead to some troubles). So if you push things out of the language, you probably have to put them into standard libs, so normal programmers can use a standard and well designed version of them. Otherwise it leads to a lot of troubles that I won't list now.

How can we establish if ranged integral values have to be outside the compiler or inside? We can list requirements, etc. Generally the more things are pushed into the compiler, the more complex it becomes and the slower it is to improve and maintain, and such features can also become more rigid. (This can be seen very well with D unittests and ddoc. I think that eventually D unittests and ddoc may have to be removed from the language and put into the standard library, and the language itself may have to grow some features (some more reflection? Maybe like Flectioned?) that allow the standard library code to define them with a handy & short syntax anyway. This would both reduce compiler complexity, allow that functionality to keep evolving, and allow the community of D programmers to improve it.)

Some features of ranged integral values:

- Making integrals ranged has some different purposes: the main one is to avoid a class of runtime bugs, another purpose is to shorten some code a little. The final purpose is to have release code that has zero speed penalty compared to the D code of today. Some of those bugs are caught by runtime code, and others of them can probably be avoided at compile time, by the type system. The compiler can also avoid emitting some runtime checks where it infers that values are within certain ranges. The code inside contracts (I mean contract programming) can also be used by the compiler to infer where it can remove more of those runtime checks.

- A short handy syntax is important, because for such ranges to become part of the D culture they have to be handy, short, etc. If D has some features that most D programmers don't use, then they become less useful.

- Probably to avoid integral-related bugs the compiler and the runtime have to check all integral values used by the program, because letting the programmer use a few of them in special spots is probably a way to not see them used much. For the same purpose such checks probably need to be on (activated) by default, like array range controls.

- Recently I have shown a possible syntax to disable/enable some controls locally, with a syntax like:

safe(stack, ranges, arrays) {...}
unsafe(ranges) {...}

I think such a syntax is better than the syntax used by ObjectPascal for such purposes.

- Once and where disabled, such controls must cost zero at runtime, because D is designed to be quick. I presume the SafeD language has to keep them always activated (that's why having the compiler remove some of them automatically, or using the contracts, is useful).

- From the links I have shown here recently you can learn how common such a class of integral-related bugs is. And generally you can't talk about a "Safe-D" if you can't sum two integral values reliably :-)

- Range types of integers/chars/enums are useful, but subranges are also useful; they are essentially subtypes specialized for just this purpose. So if:

typedef int:1..6 TyDice;
typedef TyDice:1..3 TyHalfDice;

then a function that takes a TyDice automatically accepts a TyHalfDice too. Note that Haskell's type system is so powerful that it allows the programmer to define such semantics, subtypes, etc. But D's type system is quite a bit more "primitive", so some of such things may need to be hard-wired instead of being user-defined (by programmers that know a lot of type theory, of course).

- So I think that while unittests and ddoc may be better out of the compiler, range types may be better in it (note that I use unittests and ddoc _all the time_, all my programs use them heavily, I like them. I am not saying this because I don't like unittests and documentation strings).

Bye,
bearophile
Nov 05 2008
prev sibling next sibling parent reply Hxal <hxal freenode.irc> writes:
Walter Bright wrote:
 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can alter
 the behavior of the wrapped type.
 
 For example, you should also be able to create a ranged int that can
 only contain values from n to m:
 
 RangedInt!(N, M) i;
 
 Preserving this property of structs has driven many design choices in D,
 particularly with regards to how const fits into the type system.
Does that mean we're getting implicit cast overloads? Because without RangedInt!(N, M).opImplicitCastFrom(int i) you can't pass int values to functions accepting RangedInt instances. You can't pass a different RangedInt!(X, Y) either.

It defeats the purpose of implicit range checking if you have to write litanies like foo(RangedInt!(1,10).check(i)) just to call a function.

Sorry to jump the topic like that, but last time I asked my thread got hijacked. :P
Nov 06 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
Hxal wrote:
 Does that mean we're getting implicit cast overloads?
 Because without RangedInt!(N, M).opImplicitCastFrom(int i)
opImplicitCast, yes.
Nov 06 2008
prev sibling parent reply BCS <ao pathlink.com> writes:
Reply to Walter,

 Steven Schveighoffer wrote:
 
 Couldn't one design a struct wrapper that implements this behavior?
 
 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can alter
 the behavior of the wrapped type.

 For example, you should also be able to create a ranged int that can
 only contain values from n to m:

 RangedInt!(N, M) i;

 Preserving this property of structs has driven many design choices in D,
 particularly with regards to how const fits into the type system.
Why not explicitly support this with bodied typedefs?

typedef int MyInt(int m, int M)
{
    MyInt opAssign(int i)
    {
        assert(m <= i && i <= M);
        this = i;
    }

    MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
    {
        static if(m > m_ && M_ > M)
            assert(m <= i && i <= M); // only assert as needed
        else static if(m > m_)
            assert(m <= i);
        else static if(M < M_)
            assert(i <= M);
        this = i;
    }

    // this is only to define the return type, the normal code for int is still generated
    MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this;
}
Nov 06 2008
parent reply KennyTM~ <kennytm gmail.com> writes:
BCS wrote:
 Reply to Walter,
 
 Steven Schveighoffer wrote:

 Couldn't one design a struct wrapper that implements this behavior?
 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can alter
 the behavior of the wrapped type.

 For example, you should also be able to create a ranged int that can
 only contain values from n to m:

 RangedInt!(N, M) i;

 Preserving this property of structs has driven many design choices in D,
 particularly with regards to how const fits into the type system.
 Why not explicitly support this with bodied typedefs?

 typedef int MyInt(int m, int M)
 {
     MyInt opAssign(int i)
     {
         assert(m <= i && i <= M);
         this = i;
     }

     MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
     {
         static if(m > m_ && M_ > M)
             assert(m <= i && i <= M); // only assert as needed
         else static if(m > m_)
             assert(m <= i);
         else static if(M < M_)
             assert(i <= M);
         this = i;
     }

     // this is only to define the return type, the normal code for int is still generated
     MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this;
 }

Just create a struct with an int as its only member?
Just create a struct with an int as its only member?
Nov 07 2008
parent BCS <ao pathlink.com> writes:
Reply to KennyTM~,

 Just create a struct with an int as its only member?
 
But if you do that then you have to explicitly build all of the math overloads. Sometimes, they ALL end up as simple shells, so why force the programmer to build them all? Also it forces the compiler to use the int code generator rather than potentially not inlining.
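
The boilerplate in question looks something like this (a hypothetical struct-based version, where every operator is a hand-written shell):

struct MyIntStruct(int m, int M)
{
    private int value;

    void opAssign(int i) { assert(m <= i && i <= M); value = i; }

    // each arithmetic operator must be forwarded by hand:
    int opAdd(int i) { return value + i; }
    int opSub(int i) { return value - i; }
    int opMul(int i) { return value * i; }
    // ... and so on for everything int supports
}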
Nov 07 2008
prev sibling next sibling parent reply cemiller <chris dprogramming.com> writes:
On Tue, 04 Nov 2008 12:32:59 -0800, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer
 exceptions are one of the biggest sources of runtime errors.
Yes, but those are neither type safety errors nor memory safety errors. A null pointer is neither mistyped nor can it cause memory corruption.
null pointers DO cause memory corruption:

   byte* foo = null;   // NULL!
   foo[1244916] = 5;   // WORKS; CORRUPTS!
Nov 04 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
cemiller wrote:
 null pointers DO cause memory corruption:
 
    byte* foo = null;   // NULL!
    foo[1244916] = 5;   // WORKS; CORRUPTS!
Yes, but so will any pointer that you index out of bounds. That's why safe D will not allow arithmetic on pointers.
Nov 04 2008
prev sibling parent reply Brendan Miller <catphive catphive.net> writes:
Walter Bright Wrote:

 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer
 exceptions are one of the biggest sources of runtime errors.
Yes, but those are neither type safety errors nor memory safety errors. A null pointer is neither mistyped nor can it cause memory corruption.
Well.. I can't speak for null pointers in D, but they can definitely cause memory corruption in C++. Not all OS's have memory protection. *remembers the good old days of Mac OS System 7*

Back to the important point! A couple of times in this thread I've seen people suggest that null pointers are type safe. I don't see how that statement is justifiable. People accept null because it's always been there for those of us who are long time C coders. What you have to remember is that C was not type safe in any way, shape, or form.

First off, let's clarify that we're talking about *static* type safety. Languages like Python are dynamically type safe, because at runtime you will see an exception thrown if you try to perform an operation on a type that does not support it. If you have a reference in Python, you can point it to whatever the hell you want and the runtime will prevent you from performing the wrong operation on the wrong data. It's a more limited form of type checking than static type checking, but many people find this acceptable. In a statically typed language, it is *impossible* to perform an operation on a type that does not support it, because at compile time you know the types of the objects.

Concretely, null is a pointer to address zero. For some type T, there is never any T at address zero. Therefore a statically typed language will prevent you from assigning a pointer to an object that is not of type T to a pointer declared to be of type T. That's *the entire point* of static typing. T* means "that which I point to is in the set of T". T sans the star means "I am in the set of T". Not sometimes. Not maybe. Always. Yes, you can also get performance benefits from type annotations... but that doesn't make the language statically type *safe*.

Now of course, sometimes we do want a pointer to type T to be null... but what does that *mean*? It means you have a variable that sometimes you want to hold a pointer to T... and sometimes you don't want to hold a pointer to T. This is called a variant. Different languages implement variants in different ways and have different names for them. In C, they are called unions. C, again, is not type *safe*, so if you try to treat a union as the wrong type, it will let you. However, in most languages variants provide dynamic typing, and thus offer the lesser form of type safety.

C and C++ pointers to T are variants of type T and the type of NULL. Except, of course, like unions they aren't type safe even dynamically, because the runtime won't stop you from dereferencing null. The operating system *will* stop you by killing your process, if you are on a system with protected memory, because address zero is not accessible to userspace on most systems. *Most* systems, not all.

Think about this in terms of set theory and the idea should become clear. Null should not be assignable to a pointer to T, because the object it points to at address zero does not lie within the set of T's. If it did lie within the set of T's, then this should be valid:

T myObject;
myObject = *NULL;

It shouldn't even require a type cast, because type casts are ways of breaking out of static typing. But it does in C++. In fact, this code generates:

error: invalid type argument of `unary *'

Damn right.

Now, really, what's so hard about adding a statically type safe pointer? C++ already did it, and they are called references. My complaint here, after all, was that D is apparently less type safe than C++. Now, I have other problems with C++ references. That they have value semantics is just stupid (especially since they are *called* references!). Type safety and value vs. reference semantics have nothing to do with one another. Indeed, C# added nullable value types.

Brendan
Nov 04 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Brendan Miller wrote:
 Well.. I can't speak for null pointers in D, but they can definitely
 cause memory corruption in C++. Not all OS's have memory protection.
 *remembers the good old days of Mac OS system 7*
Those machines are obsolete, for excellent reasons <g>. If, for some reason, a D implementation needs to be implemented for such a machine, the solution is to optionally insert a runtime check, analogous to array bounds checking.
 Concretely null is a pointer to address zero. For some type T, there
 is never any T at address zero. Therefore a statically typed language
 will prevent you from assigning a pointer to an object that is not of
 type T to a pointer declared to be of type T. That's *the entire point*
 of static typing. T* means "that which I point to is in the set of
 T". T sans the star means "I am in the set of T". Not sometimes. Not
 maybe. Always.
I understand your point, and it sounds right technically. But practically, I'm not convinced.

For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
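
For reference, the null-terminated version of such a list in D (a minimal sketch):

struct Node
{
    int cargo;
    Node* next; // null marks the end of the list
}

int sum(Node* head)
{
    int total = 0;
    for (Node* n = head; n !is null; n = n.next) // the null test is the end-of-list test
        total += n.cargo;
    return total;
}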
Nov 04 2008
parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Tue, Nov 4, 2008 at 11:40 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 I understand your point, and it sounds right technically. But practically,
 I'm not convinced.

 For example, consider a linked list. How do you know you've reached the end
 of the list? By the pointer being null or pointing to some "impossible"
 object. If you pick the latter, what really have you gained over a null
 pointer?
The implication of non-nullable types isn't that nullable types disappear; quite the opposite, in fact. Nullable types have obvious use for exactly the reason you explain. The problem arises when nullable types are used in situations where it makes _no sense_ for null to appear. This is where bugs show up. In a system that has both nullable and non-null types, nullable types act as a sort of container, preventing you from accessing anything through them, as it cannot be statically proven that the access will be legal at runtime. In order to access something from a nullable type, you have to convert it to a non-null type. Delight uses D's "declare a variable in the condition of an if or while" to great effect here:

if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
{
    // f is known not to be null.
}
else
{
    // something else happened.  Handle it.
}

Null still has a purpose. It's just that its purpose is really only to signal a special case.
Nov 04 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jarrett Billingsley wrote:
 The implication of non-nullable types isn't that nullable types
 disappear; quite the opposite, in fact.  Nullable types have obvious
 use for exactly the reason you explain.  The problem arises when
 nullable types are used in situations where it makes _no sense_ for
 null to appear.  This is where bugs show up.  In a system that has
 both nullable and non-null types, nullable types act as a sort of
 container, preventing you from accessing anything through them as it
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:
 
 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
     // f is known not to be null.
 }
 else
 {
     // something else happened.  Handle it.
 }
I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
Nov 05 2008
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2008-11-05 03:18:50 -0500, Walter Bright <newshound1 digitalmars.com> said:

 I don't see what you've gained here. The compiler certainly can do flow 
 analysis in some cases to know that a pointer isn't null, but that 
 isn't generalizable. If a function takes a pointer parameter, no flow 
 analysis will tell you if it is null or not.
I'm not sure how you're reading things, but to me having two kinds of pointers (nullable and non-nullable) is exactly what you need to enable nullness flow analysis across function boundaries.

Basically, if you declare some pointer to be restricted to not-null in a function signature, and then try to call the function by passing it a possibly null pointer, the compiler can tell you that you need to check for null at the call site before calling the function.

It then ensues that when given a non-nullable pointer you can call a function requiring a non-nullable pointer directly without any check for null, because you know the pointer you received can't be null.

Currently, you can achieve this with proper documentation of functions saying whether arguments accept null and if return values can return null, and write your code with those assumptions in mind. More often than not, however, there is no such documentation and you find yourself checking for null a lot more than necessary. If this property about pointers in function parameters and return values were known to the compiler, the compiler could check for you that you're doing things correctly, warn you whenever you're forgetting a null check, and optimise away checks for null on these pointers.

I know the null-dereferencing problem can generally be caught easily at runtime, but sometimes your null comes from far away in the program (someone set a global to null for instance) and you're left to wonder who put a null value there in the first place. Non-nullable pointers would help a lot in those cases because you no longer have to test every code path, and the error of giving a null value would be caught at the source (with the compiler telling you to check against null), not only where it's being dereferenced.

- - -

That said, I think this could be done using a template. Consider this:

struct NotNullPtr(Type)
{
    private Type* ptr;

    this(Type* ptr)
    {
        opAssign(ptr);
    }

    void opAssign(Type* ptr)
    {
        // if this gets inlined and you have already checked for null, then
        // hopefully the optimizer will remove this redundant check.
        if (ptr)
            this.ptr = ptr;
        else
            throw new Exception("Unacceptable null value.");
    }

    void opAssign(NotNullPtr other)
    {
        this.ptr = other.ptr;
    }

    Type* opCast()
    {
        return ptr;
    }

    ref Type opDeref()
    {
        return *ptr;
    }
    alias opDeref opStar;

    // ... implement the rest yourself
}

(not tested)

You could use this template everywhere you want to be sure a pointer isn't null. It guarantees that its value will never be null, and will throw an exception at the source where you attempt to put a null value in it, not when you attempt to dereference it later, when it's too late and your program has already been put in an incorrect state.

NotNullPtr!(int) globalThatShouldNotBeNull;

int* foo();
globalThatShouldNotBeNull = foo(); // will throw if you attempt to set it to null.

void bar(NotNullPtr!(int) arg);
bar(globalThatShouldNotBeNull); // no need to check for null.

The greatest downside to this template is that since it isn't part of the language, almost no one will use it in their function prototypes and return types. That's not counting that its syntax is verbose and not very appealing (although it's not much worse than boost::shared_ptr or std::auto_ptr). But still, if you have a global or member variable that must not be null, it can be of use; and if you have a function where you want to put the burden of checking for null on the caller, it can be of use.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 05 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Michel Fortin:
 Basically, if you declare some pointer to be restricted to not-null in 
 a function signature, and then try to call the function by passing it a 
 possibly null pointer, the compiler can tell you that you need to check 
 for null at the call site before calling the function.
 
 It then ensues that when given a non-nullable pointer you can call a 
 function requiring a non-nullable pointer directly without any check 
 for null, because you know the pointer you received can't be null.
 
 Currently, you can achieve this with proper documentation of functions 
 saying whether arguments accept null and if return values can return 
 null, and write your code with those assumptions in mind. More often 
 than not, however, there is no such documentation and you find yourself 
 checking for null a lot more than necessary. If this property about 
 pointers in function parameters and return values were known to the 
 compiler, the compiler could check for you that you're doing things 
 correctly, warn you whenever you're forgetting a null check, and 
 optimise away checks for null on these pointers.
The same is true of making integral values into range values. If I want to write a function that takes an iterable of results of throwing a dice, I can use an enum, or check every item of the iterable for being in the range 1 - 6. If range values are available I can just write:

StatsResults stats(Dice[] throwing_results) { ...

Where Dice is:

typedef int:1..7 Dice;

I then don't need to remember to check items for being in 1-6 inside stats(), and the check is pushed up, toward the place where that throwing_results was created (or where it comes from disk, user input, etc). This avoids some bugs and reduces some code.

Bye,
bearophile
Nov 05 2008
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2008-11-05 08:16:59 -0500, bearophile <bearophileHUGS lycos.com> said:

 The same is true of making integral values into range values. If I want 
 to write a function that takes an iterable of results of throwing a 
 dice, I can use an enum, or check every item of the iterable for 
 being in the range 1 - 6. If range values are available I can just write:
 
 StatsResults stats(Dice[] throwing_results) { ...
 
 Where Dice is:
 typedef int:1..7 Dice;
 
 I then don't need to remember to check items for being in 1-6 inside 
 stats(), and the check is pushed up, toward the place where that 
 throwing_results was created (or where it comes from disk, user input, 
 etc). This avoids some bugs and reduces some code.
It's exactly the same thing, except that for numbers you may want much more than simple ranges. You could want non-zero numbers, odd or even numbers, square numbers, etc. I have the feeling that whatever the language tries to restrict about numbers, it will never be enough. So my feeling is that this is better left to a template.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
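
A sketch of what such a template might look like (hypothetical code; pred can be any function returning bool):

struct Checked(T, alias pred)
{
    private T value;

    void opAssign(T v)
    {
        assert(pred(v), "value rejected by predicate");
        value = v;
    }

    T get() { return value; }
}

// usage, for a non-zero integer:
// bool nonZero(int x) { return x != 0; }
// Checked!(int, nonZero) n;
// n = 5; // ok
// n = 0; // fails the assert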
Nov 06 2008
prev sibling next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Jarrett Billingsley wrote:
 The implication of non-nullable types isn't that nullable types
 disappear; quite the opposite, in fact.  Nullable types have obvious
 use for exactly the reason you explain.  The problem arises when
 nullable types are used in situations where it makes _no sense_ for
 null to appear.  This is where bugs show up.  In a system that has
 both nullable and non-null types, nullable types act as a sort of
 container, preventing you from accessing anything through them as it
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:

 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
    // f is known not to be null.
 }
 else
 {
    // something else happened.  Handle it.
 }
I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
What? Is your response in response to my post at all? I am not talking about flow analysis on "normal" pointer types. I am talking about the typing system actually being modified to allow a programmer to express the idea, with a _type_, and not with static checking, that a reference/pointer value _may not be null_.

In a type system with non-null types, if a function takes a non-null parameter and you pass it a nullable pointer, _you get an error at compile time_.

// foo takes a non-null int*.
void foo(int* x)
{
    writefln("%s", *x);
}

// bar returns a nullable int* - an int*?.
int*? bar(int x)
{
    if(x < 10)
        return new int(x);
    else
        return null;
}

foo(bar(3)); // compiler error, you can't pass a potentially null type
             // into a parameter that can't be null, moron

if(auto p = bar(3))
    foo(p); // ok
else
    throw new Exception("Wah wah wah bar returned null");

With nullable types, flow analysis doesn't have to be done. It is implicit in the types. It is mangled into function names. foo _cannot_ take a pointer that may be null. End of story.
Nov 05 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
Jarrett Billingsley wrote:
 With nullable types, flow analysis doesn't have to be done.  It is
 implicit in the types.  It is mangled into function names.  foo
 _cannot_ take a pointer that may be null.  End of story.
Sure, which is why I was puzzled at the example given, which is about something else entirely. What you're talking about is a type constructor to create another kind of pointer. It's a significant increase in complexity. That's why I was talking about complexity being a downside of this - there is a tradeoff.
Nov 05 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Wed, Nov 5, 2008 at 10:43 PM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
 On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 Jarrett Billingsley wrote:
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:

 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
    // f is known not to be null.
 }
 else
 {
    // something else happened.  Handle it.
 }
I don't see what you've gained here. The compiler certainly can do flow analysis in some cases to know that a pointer isn't null, but that isn't generalizable. If a function takes a pointer parameter, no flow analysis will tell you if it is null or not.
 What? Is your response in response to my post at all? I am not talking
 about flow analysis on "normal" pointer types. I am talking about the
 typing system actually being modified to allow a programmer to express
 the idea, with a _type_, and not with static checking, that a
 reference/pointer value _may not be null_.

 In a type system with non-null types, if a function takes a non-null
 parameter and you pass it a nullable pointer, _you get an error at
 compile time_.

 // foo takes a non-null int*.
 void foo(int* x)
 {
     writefln("%s", *x);
 }

 // bar returns a nullable int* - an int*?.
 int*? bar(int x)
 {
     if(x < 10)
         return new int(x);
     else
         return null;
 }

 foo(bar(3)); // compiler error, you can't pass a potentially null type
              // into a parameter that can't be null, moron

 if(auto p = bar(3))
     foo(p); // ok
 else
     throw new Exception("Wah wah wah bar returned null");

 With nullable types, flow analysis doesn't have to be done. It is
 implicit in the types. It is mangled into function names. foo
 _cannot_ take a pointer that may be null. End of story.
I didn't really get what you meant the first time either. The thing about Delight's use of auto "to great effect" wasn't clear. I assumed it was basically the same as D's auto inside an if, but I see now that it's not. Looks like a run-time type deduction, even though it's not really. Kinda neat.

--bb
Nov 05 2008
prev sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Wed, Nov 5, 2008 at 9:33 AM, Bill Baxter <wbaxter gmail.com> wrote:
 I didn't really get what you meant the first time either.  The thing
 about Delight's use of auto "to great effect" wasn't clear.   I
 assumed it was basically the same as D's auto inside an if, but I see
 now that it's not.  Looks like a run-time type deduction, even though
 it's not really.  Kinda neat.
It's almost the same as D's variable-inside-an-if, with the addition that you can use it to convert a nullable type to a non-null type. Hence, in:

if(auto f = someFunctionThatCanReturnNull())

if someFunctionThatCanReturnNull returns an int*? (nullable pointer to int), typeof(f) will just be int* (non-null pointer to int), since in the scope of the if statement, f is provably non-null.
Nov 05 2008
prev sibling next sibling parent Brendan Miller <catphive catphive.net> writes:
Walter Bright Wrote:

 Brendan Miller wrote:
 Well.. I can't speak for null pointers in D, but they can definitely
 cause memory corruption in C++. Not all OS's have memory protection.
 *remembers the good old days of Mac OS system 7*
Those machines are obsolete, for excellent reasons <g>. If, for some reason, a D implementation needs to be implemented for such a machine, the solution is to optionally insert a runtime check, analogous to array bounds checking.
You mean D isn't meant to run on embedded hardware? I thought it was a systems programming language? A lot of hardware today has no MMU, which as I understand it means you can't have memory protection. If you are only targeting x86 after the ability to have memory protection was added to the hardware... then you can make all kinds of assumptions I guess.
 
 Concretely null is a pointer to address zero. For some type T, there
 is never any T at address zero. Therefore a statically typed language
 will prevent you from assigning a pointer to an object that is not of
 type T to a pointer declared to be of type T. That's *the entire point*
 of static typing. T* means "that which I point to is in the set of
 T". T sans the star means "I am in the set of T". Not sometimes. Not
 maybe. Always.
I understand your point, and it sounds right technically. But practically, I'm not convinced. For example, consider a linked list. How do you know you've reached the end of the list? By the pointer being null or pointing to some "impossible" object. If you pick the latter, what really have you gained over a null pointer?
The short answer is you use a variant. It handles the case where you would use null slightly better (because it is dynamically type safe). For a C style language, just having two kinds of pointers might be more natural. Like maybe:

safe T* object1;   // does not permit null.
unsafe T* object2; // does permit null.

and then have a cast between them:

object1 = (safe T*)object2; // This throws some kind of well defined exception if object2 is null.

The long answer is that the best way to learn about type safety is to check out SML or OCaml. These are a couple of the few truly statically type safe languages. Languages like those introduced type safety, and ideas like generics and type inference. ML is to type safety as Smalltalk is to object orientation. You will probably never write a real world program in ML, but learning it is probably the best way to get a good understanding of where things like type safety and templates came from in the first place.

Also, the linked list example you give is actually *way easier* in a language like ML that supports variants. The reason for this is that variants can be used in conjunction with a pattern matching syntax, which is kind of like a switch statement on steroids.

As a side note, I find it interesting that C++ templates are actually much more powerful than the ML style generics they emulate. Specifically, templates can do template metaprogramming, whereas I'm pretty sure this is not possible in any ML style languages.
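
A rough D sketch of that two-pointer idea, modeled as a library wrapper with a throwing conversion (SafePtr is an illustrative name, not an actual D feature):

struct SafePtr(T)
{
    private T* p;

    // the "(safe T*)" cast: throws a well defined exception on null
    static SafePtr opCall(T* unsafePtr)
    {
        if (unsafePtr is null)
            throw new Exception("null cannot become a safe pointer");
        SafePtr s;
        s.p = unsafePtr;
        return s;
    }

    T* get() { return p; } // never null once constructed
}

// int* object2 = ...;
// auto object1 = SafePtr!(int)(object2); // throws if object2 is null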
Nov 05 2008
prev sibling parent Mike Hearn <mike plan99.net> writes:
 The basic problem is that it's hard to integrate a language with non-
 nullable types with libraries written without them (e.g. when auto-
 generating bindings with BCD). This would likely be a big problem for D, 
 since integrating with existing C and C++ code is a major feature.
Can't that be solved by reversing the syntax, i.e., you mark variables that cannot be null rather than variables that can be? The compiler then requires you to prove the non-nullness of the value on that codepath (or cast it away with nonnull).

I'd love to see non-null types in D2. Intuitively, it'd catch at compile time quite a few bugs I see in my programs (assuming a strong compiler analysis).
Nov 28 2008