digitalmars.D - null and type safety

Brendan Miller (15/15) Nov 03 2008 So I'm curious about D as a programming language, especially as it compa...

bearophile (7/9) Nov 03 2008 In a (system) language I want to be able to create tree data structures ...

Thomas Leonard (16/29) Nov 04 2008 Note that the maybe types in Delight are independent of the syntax

Andrei Alexandrescu (7/30) Nov 04 2008 It could be done if non-nullable pointers/references would be allowed in...

bearophile (6/7) Nov 04 2008 I think some languages add something to help that because:

Denis Koroskin (7/44) Nov 04 2008 Unfortunately, you can have null references in D:
Walter Bright (3/6) Nov 04 2008 Yes, but those are neither type safe errors or memory safe errors. A

Jarrett Billingsley (5/12) Nov 04 2008 Dereferencing a null pointer is *always* a bug, it doesn't matter how

Walter Bright (7/11) Nov 04 2008 Sure. But I'm interested in creating a safe subset of D, and so the more...

Jarrett Billingsley (7/13) Nov 04 2008 Have you looked at Delight at all? I wouldn't call the impact of

bearophile (5/6) Nov 04 2008 Beside the topic of nullable types you are discussing about, Delight's l...
Walter Bright (4/19) Nov 04 2008 Memory corruption is a big one. Another are sequential consistency bugs,...

Robert Fraser (6/21) Nov 04 2008 FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've...

Lionello Lunesu (6/11) Nov 04 2008 Hear hear!
Walter Bright (13/17) Nov 05 2008 That isn't really the point of const. The point of const is to be able

Nick B (5/5) Nov 05 2008 Can someone explain what is the plan for when Tango turns 1.0

Steven Schveighoffer (18/25) Nov 05 2008 I was working in C# today, and I realized one very excellent design

Walter Bright (9/10) Nov 05 2008 If that cannot be done in D, then D needs some design improvements.

bearophile (31/34) Nov 05 2008 I presume you meant something like this:
Hxal (10/21) Nov 06 2008 Does that mean we're getting implicit cast overloads?

Walter Bright (2/4) Nov 06 2008 opImplicitCast, yes.

BCS (18/34) Nov 06 2008 Why not explicitly support this with bodied typedefs?

KennyTM~ (2/43) Nov 07 2008 Just create a struct with an int as its only member?

BCS (5/7) Nov 07 2008 But if you do that then you have to explicitly build all of the math ove...

cemiller (5/11) Nov 04 2008 null pointers DO cause memory corruption:

Walter Bright (3/7) Nov 04 2008 Yes, but so will any pointer that you index out of bounds. That's why

Brendan Miller (20/27) Nov 04 2008 Well.. I can't speak for null pointers in D, but they can definitely cau...

Walter Bright (11/21) Nov 04 2008 Those machines are obsolete, for excellent reasons . If, for some

Jarrett Billingsley (23/29) Nov 04 2008 The implication of non-nullable types isn't that nullable types

Walter Bright (5/25) Nov 05 2008 I don't see what you've gained here. The compiler certainly can do flow

Michel Fortin (79/83) Nov 05 2008 I'm not sure how you're reading things, but to me having two kinds of

bearophile (8/26) Nov 05 2008 The same is true making integral values become range values. If I want t...

Michel Fortin (10/24) Nov 06 2008 It's exactly the same thing, except that for numbers you may want much

Jarrett Billingsley (23/49) Nov 05 2008 What? Is your response in response to my post at all? I am not

Walter Bright (6/9) Nov 05 2008 Sure, which is why I was puzzled at the example given, which is about

Bill Baxter (8/51) Nov 05 2008 I didn't really get what you meant the first time either. The thing
Jarrett Billingsley (8/13) Nov 05 2008 It's almost the same as D's variable-inside-an-if, with the addition

Brendan Miller (12/37) Nov 05 2008 You mean D isn't mean to run on embedded hardware? I thought it was a sy...
Mike Hearn (2/6) Nov 28 2008 Can't that be solved by reversing the syntax, ie, you mark variables tha...

Brendan Miller <catphive catphive.net> writes:

So I'm curious about D as a programming language, especially as it compares to
C++.

One problem that C++ made a partial effort to solve was that normal pointers in
[...] essentially aren't type safe.

Consider that null can always be assigned to a pointer or reference to type T
in those languages, and null is clearly *not* of type T, so operations on a
variable declared as type T are doomed to fail.

T *myObject = null;
myObject->myMethod(); // fails, despite the fact that myObject is of type T
                                         // and myMethod is defined for type T.

Null is a holdover from C and has no place in a typesafe language. The
designers of C++ knew this, and so introduced the c++ reference type:

T &myObjectRef = ...;

which cannot be null.

T &myObjectRef = null; // fails at compile time
T &myObjectRef = *ptr; // if ptr is null, operation is "undefined".


[...] marketing purposes and never really thought through these issues (although I read an
interview where Anders Hejlsberg admitted this was a [...]
occurred to him too late to fix).

This is obviously a problem. Everyone knows that null pointer exceptions in
[...] the biggest sources of
runtime errors. Furthermore, there's no reason whatsoever that these problems
can't be caught by the compiler in a strongly typed language like C++ that has
the idea of a non-nullable pointer. The whole point of type annotations is to
catch these errors before runtime after all. Otherwise it's just a lot of
useless typing. C++ partially solves the problem with references, and
truly statically type-safe languages like ML solve this problem by making variables
typesafe by default, and using an Optional type to wrap nullable types.

So my question is: as the successor to C++, how does D solve this problem? I'm
looking through the D docs, that are a little sparse, but I'm not seeing any
references to pointers that can't be nulled.

Brendan

Nov 03 2008

bearophile <bearophileHUGS lycos.com> writes:

Brendan Miller:
Null is a holdover from C and has no place in a typesafe language.<

In a (system) language I want to be able to create tree data structures that
contain a cargo plus true pointers, and those pointers to structs can be null
in leaves.
Having a safe language is good, but I want 1 language that gives me sharp tools
too.

I'm looking through the D docs, that are a little sparse, but I'm not seeing
any references to pointers that can't be nulled.<

You may like the Delight language (it compiles to D2):
http://delight.sourceforge.net/null.html

Bye,
bearophile

Nov 03 2008

Thomas Leonard <talex5+d gmail.com> writes:

On Mon, 03 Nov 2008 17:04:48 -0500, bearophile wrote:

 Brendan Miller:
Null is a holdover from C and has no place in a typesafe language.<

 
 In a (system) language I want to be able to create tree data structures
 that contain a cargo plus true pointers, and those pointers to structs
 can be null in leaves. Having a safe language is good, but I want 1
 language that gives me sharp tools too.
 
I'm looking through the D docs, that are a little sparse, but I'm not
seeing any references to pointers that can't be nulled.<

 
 You may like the Delight language (it compiles to D2):
 http://delight.sourceforge.net/null.html

Note that the maybe types in Delight are independent of the syntax 
changes (apart from the ? type annotation) and you could easily enable 
them when compiling D code too (basically, just remove the code that 
disables this feature when it detects it's compiling D syntax source).

The basic problem is that it's hard to integrate a language with non-
nullable types with libraries written without them (e.g. when auto-
generating bindings with BCD). This would likely be a big problem for D, 
since integrating with existing C and C++ code is a major feature.

Even if you add the annotations manually to an existing library, you 
often get a poor API. e.g. this GLib function for copying a string:

  char*? g_strdup(const(char)*? s)

If you pass NULL in, you get NULL out. That's a useful convenience in C, 
but in Delight it forces you to check whether the result is null before 
you can use it. If the API had been designed for Delight in the first 
place, it wouldn't take or return a nullable type.

Nov 04 2008

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Thomas Leonard wrote:
 On Mon, 03 Nov 2008 17:04:48 -0500, bearophile wrote:
 
 Brendan Miller:
 Null is a holdover from C and has no place in a typesafe language.<

 In a (system) language I want to be able to create tree data structures
 that contain a cargo plus true pointers, and those pointers to structs
 can be null in leaves. Having a safe language is good, but I want 1
 language that gives me sharp tools too.

 I'm looking through the D docs, that are a little sparse, but I'm not
 seeing any references to pointers that can't be nulled.<

 You may like the Delight language (it compiles to D2):
 http://delight.sourceforge.net/null.html

 
 Note that the maybe types in Delight are independent of the syntax 
 changes (apart from the ? type annotation) and you could easily enable 
 them when compiling D code too (basically, just remove the code that 
 disables this feature when it detects it's compiling D syntax source).
 
 The basic problem is that it's hard to integrate a language with non-
 nullable types with libraries written without them (e.g. when auto-
 generating bindings with BCD). This would likely be a big problem for D, 
 since integrating with existing C and C++ code is a major feature.

It could be done if non-nullable pointers/references would be allowed in 
addition to nullable ones.

A note however - although I agree it's nice to have non-nullable types, 
the argument is a bit overstated as type safety has little to do with 
it. Null checks are easy and cheap to check for deterministically.


Andrei

Nov 04 2008

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:
 Null checks are easy and cheap to check for deterministically.

I think some languages add something to help that because:
- Sometimes you may forget to add those checks manually where you need them.
- If you don't need to write those checks, the code ends up a little shorter and
cleaner. Some kinds of programmers like this. C++ programmers probably care less
about it.

Bye,
bearophile

Nov 04 2008

"Denis Koroskin" <2korden gmail.com> writes:

On Tue, 04 Nov 2008 00:10:29 +0300, Brendan Miller <catphive catphive.net>  
wrote:

 So I'm curious about D as a programming language, especially as it  
 compares to C++.

 One problem that C++ made a partial effort to solve was that normal  

 essentially aren't type safe.

 Consider that null can always be assigned to a pointer or reference to  
 type T in those languages, and null is clearly *not* of type T, thus  
 operations on a variable denoted of type T, are doomed to fail.

 T *myObject = null;
 myObject->myMethod(); // fails, despite the fact that myObject is of  
 type T
                                          // and myMethod is defined for  
 type T.

 Null is a holdover from C and has no place in a typesafe language. The  
 designers of C++ knew this, and so introduced the c++ reference type:

 T &myObjectRef = ...;

 which cannot be null.

 T &myObjectRef = null; // fails at compile time
 T &myObjectRef = *ptr; // if ptr is null, operation is "undefined".

Unfortunately, you can have null references in D:
Object o = null;


 marketing purposes and never really thought through these issues  
 (although I read an interview where Anders Hejlsberg admitted this was a  




 This is obviously a problem. Everyone knows that null pointer exceptions  

 sources of runtime errors. Furthermore, there's no reason whatsoever  
 that these problems can't be caught by the compiler in a strongly typed  
 language like C++ that has the idea of a non-nullable pointer. The whole  
 point of type annotations is to catch these errors before runtime after  
 all. Otherwise it's just a lot of useless typing. C++ partially solves  
 the problem with references, and truly statically type-safe  
 languages like ML solve this problem by making variables typesafe by  
 default, and using an Optional type to wrap nullable types.

 So my question is: as the successor to C++, how does D solve this  
 problem? I'm looking through the D docs, that are a little sparse, but  
 I'm not seeing any references to pointers that can't be nulled.

 Brendan

Nullable types have been proposed several times by many people, but I  
don't recall any response from Walter or Andrei :(

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer

 the biggest sources of runtime errors.

Yes, but those are neither type safe errors nor memory safe errors. A 
null pointer is neither mistyped nor can it cause memory corruption.

Nov 04 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Tue, Nov 4, 2008 at 3:32 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer

 the biggest sources of runtime errors.

 Yes, but those are neither type safe errors nor memory safe errors. A null
 pointer is neither mistyped nor can it cause memory corruption.

Dereferencing a null pointer is *always* a bug, it doesn't matter how
"safe" it is.  Don't you think that eliminating something that's
always a bug at compile time is a worthwhile investment?

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Jarrett Billingsley wrote:
 Dereferencing a null pointer is *always* a bug, it doesn't matter how
 "safe" it is.

Sure. But I'm interested in creating a safe subset of D, and so the more 
correct interpretation of what constitutes "safety" is important.

 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?

Not always. There's a commensurate increase in complexity that may not 
make it worth while.

My focus is on eliminating bugs that cannot be reliably detected even at 
run time. This will be a big win for D.

Nov 04 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?

 Not always. There's a commensurate increase in complexity that may not make
 it worth while.

Have you looked at Delight at all?  I wouldn't call the impact of
nullable types on D "commensurate."  It's probably far less than
const, invariant, pure, and escape analysis.

 My focus is on eliminating bugs that cannot be reliably detected even at run
 time. This will be a big win for D.

Can you expand upon this a bit?  What exactly are some bugs that can't
be reliably detected at runtime other than memory corruption?

Nov 04 2008

bearophile <bearophileHUGS lycos.com> writes:

Jarrett Billingsley:
 Have you looked at Delight at all?

Besides the topic of nullable types you are discussing, Delight's look is
designed to appeal mostly to Python programmers (despite being just D2, a
little sugared), and/or to people that care a lot about having a clean(er)
syntax, so C/C++ programmers may be less interested...

If Delight becomes refined and debugged enough, I hope to see it bundled by
default with the LDC compiler, as well as Tango, a GUI toolkit like GTK for D,
and a few other goodies, like an editor/almost-IDE. I think it can become a way
to "sell" D2 to other kinds of programmers.

Bye,
bearophile

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Jarrett Billingsley wrote:
 On Tue, Nov 4, 2008 at 5:31 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?

 Not always. There's a commensurate increase in complexity that may not make
 it worth while.

 
 Have you looked at Delight at all?  I wouldn't call the impact of
 nullable types on D "commensurate."  It's probably far less than
 const, invariant, pure, and escape analysis.

Sorry, I have not looked at Delight.


 My focus is on eliminating bugs that cannot be reliably detected even at run
 time. This will be a big win for D.

 
 Can you expand upon this a bit?  What exactly are some bugs that can't
 be reliably detected at runtime other than memory corruption?

Memory corruption is a big one. Another is sequential consistency bugs, 
and then there's function hijacking.

Nov 04 2008

Robert Fraser <fraserofthenight gmail.com> writes:

Walter Bright wrote:
 Jarrett Billingsley wrote:
 Dereferencing a null pointer is *always* a bug, it doesn't matter how
 "safe" it is.

 
 Sure. But I'm interested in creating a safe subset of D, and so the more 
 correct interpretation of what constitutes "safety" is important.
 
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?

 
 Not always. There's a commensurate increase in complexity that may not 
 make it worth while.
 
 My focus is on eliminating bugs that cannot be reliably detected even at 
 run time. This will be a big win for D.

FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've 
often run into bugs that non-nullable types could have prevented 
(including one on a production system... well, there was another bug 
that raised an exception causing something else to be uninitialized and 
the system came crashing down).

Nov 04 2008

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Robert Fraser" <fraserofthenight gmail.com> wrote in message 
news:geregs$sj1$1 digitalmars.com...
 FWIW, I've _never_ run into a bug const could have prevented. OTOH, I've 
 often run into bugs that non-nullable types could have prevented 
 (including one on a production system... well, there was another bug that 
 raised an exception causing something else to be uninitialized and the 
 system came crashing down).

Hear hear!

Nullness should have nothing to do with a type having reference or value 
semantics. These two concepts are orthogonal.

L.

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Robert Fraser wrote:
 Walter Bright wrote:
 My focus is on eliminating bugs that cannot be reliably detected even 
 at run time. This will be a big win for D.

 FWIW, I've _never_ run into a bug const could have prevented.

That isn't really the point of const. The point of const is to be able 
to write functions that can accommodate both mutable and invariant 
arguments. The point of invariantness is to be able to prove that code 
has certain properties. This is much better than relying on your 
programming team never making a mistake.

For example, you can do functional programming in C++. It's just that 
the compiler cannot know you're doing that, and so cannot take advantage 
of it. Furthermore, the compiler cannot detect when code is not 
functional, and so if someone hands you a million lines of code you have 
no freakin' way of determining if it adheres to functional principles or 
not.

This really matters when one starts doing concurrent programming.

Nov 05 2008

Nick B <nick.barbalich gmail.com> writes:

Can someone explain what is the plan for when Tango turns 1.0

Will the code be frozen ?

Will the version of D it runs with be frozen ?


cheers
Nick B

Nov 05 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"Walter Bright" wrote
 Jarrett Billingsley wrote:
 Don't you think that eliminating something that's
 always a bug at compile time is a worthwhile investment?

 Not always. There's a commensurate increase in complexity that may not 
 make it worth while.

 My focus is on eliminating bugs that cannot be reliably detected even at 
 run time. This will be a big win for D.



I was working in C# today, and I realized one very excellent design [...]
dereference errors.  In D, it simply doesn't happen, because the array has a 
guard that is stored with the reference -- the length.  I think these are 
similar to the kinds of things that Jarrett is referring to.  Something 
that's like a pointer, but can't ever be null, so you never have to check it 
for null before using it.  Except Jarrett's idea eliminates it at compile 
time vs. run time.

Couldn't one design a struct wrapper that implements this behavior? 
Something like:

struct NonNullable(T)
{
   void opAssign(T t) {/* make sure t is not null */}
   void opAssign(NonNullable!(T) t) {/* no null check */}
   ...
}
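
A minimal sketch of such a wrapper, for illustration only. The names NonNull, payload, get and Widget are hypothetical, and the check happens at run time through assert rather than at compile time, so this is weaker than a language-level non-nullable type. Note also that NonNull!T.init can still bypass the constructor.

```d
// Hypothetical wrapper around a class reference that is checked for null on entry.
struct NonNull(T) if (is(T == class))
{
    private T payload;

    @disable this();            // forbid default construction, which would leave payload null

    this(T value)
    {
        assert(value !is null, "NonNull constructed from null");
        payload = value;
    }

    // Assigning a raw reference needs the null check...
    void opAssign(T value)
    {
        assert(value !is null, "null assigned to NonNull");
        payload = value;
    }

    // ...assigning another NonNull does not, since it was checked already.
    void opAssign(typeof(this) other)
    {
        payload = other.payload;
    }

    // Forward member access to the wrapped reference.
    inout(T) get() inout { return payload; }
    alias get this;
}

class Widget
{
    int size() { return 3; }
}

unittest
{
    auto w = NonNull!Widget(new Widget);
    assert(w.size() == 3);      // forwarded through alias this
    w = new Widget;             // checked assignment
    auto w2 = NonNull!Widget(new Widget);
    w = w2;                     // unchecked assignment
    assert(w.get !is null);
}
```

Compiling with `-unittest -main` runs the checks above.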

-Steve

Nov 05 2008

Walter Bright <newshound1 digitalmars.com> writes:

Steven Schveighoffer wrote:
 Couldn't one design a struct wrapper that implements this behavior? 

If that cannot be done in D, then D needs some design improvements. 
Essentially, any type should be "wrappable" in a struct which can alter 
the behavior of the wrapped type.

For example, you should also be able to create a ranged int that can 
only contain values from n to m:

RangedInt!(N, M) i;

Preserving this property of structs has driven many design choices in D, 
particularly with regards to how const fits into the type system.
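
As a rough illustration of the idea (not Walter's actual design), a ranged integer can be sketched as a struct template. The member names value and get are assumptions made for this example, and the range check runs at run time via assert.

```d
// Hypothetical sketch: the bounds are template parameters, so each range is its own type.
struct RangedInt(int lo, int hi) if (lo <= hi)
{
    private int value = lo;     // the default value lies inside the range

    this(int v) { opAssign(v); }

    // Every store goes through this check.
    void opAssign(int v)
    {
        assert(lo <= v && v <= hi, "value out of range");
        value = v;
    }

    // Reads convert back to a plain int.
    int get() const { return value; }
    alias get this;
}

unittest
{
    RangedInt!(1, 6) dice;      // starts at lo
    assert(dice.get == 1);
    dice = 4;
    int n = dice;               // implicit conversion to int
    assert(n + 2 == 6);

    // Inverted bounds are rejected when the template is instantiated.
    static assert(!__traits(compiles, RangedInt!(6, 1)));
}
```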

Nov 05 2008

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:
 For example, you should also be able to create a ranged int that can 
 only contain values from n to m:
 RangedInt!(N, M) i;

I presume you meant something like this:
Ranged!(int, N, M) i;

That has to work (assert and when possible statically assert) in situations
like the following ones too:

i = long.max; // static err
i = ulong.max; // static err

Ranged!(ulong, 0, ulong.max) ul, r1, r2;
ul = ulong.max + 10; // static err
r1 = ulong.max / 2 + 100;
r2 = ulong.max / 2 + 100;
r1 + r2 // runtime err

As you have discussed recently, for a compiler designer choosing what
goes in the language and what to keep out of it and into libs is very
important. Some time ago I read about a special Scheme *compiler* that is very
small and very "pluggable": a large part of the
language is implemented by external libs, even a *static* type system, and
several other things that usually are assumed part of the compiler itself.

It's not just a matter of creating a very flexible compiler core: even if you
somehow are able to create it, then there are other problems to be solved:
designing languages is quite hard, so you can't expect an average programmer to
be a good designer like that (that's also why AST macros may lead to some
troubles). So if you push things out of the language, you probably have to put
them into standard libs, so normal programmers can use a standard and well
designed version of them. Otherwise it leads to a lot of troubles that I don't
list now.

How can we establish if ranged integral values have to be outside the compiler
or inside? We can list requirements, etc. Generally the more things are pushed
into the compiler, the more complex it becomes, slower to improve and maintain,
and such features can also become more rigid (this can be seen very well with D
unittests and ddoc. I think that eventually D unittests and ddoc may have to be
removed from the language, and put into the standard library, and the language
itself may have to grow some features (some more reflection? Maybe like
Flectioned?) that allow the standard library code to define them with a handy &
short syntax anyway. This would reduce compiler complexity, let those
features evolve more freely, and allow the community of D
programmers to improve them).

Some features of ranged integrals:
- Making integrals ranged has some different purposes, the main one is to avoid
a class of runtime bugs, another purpose is to shorten some code a little. The
final purpose is to have release code that has zero speed penalty compared to
the D code of today. Some of those bugs are controlled by runtime code and
others can probably be avoided at compile time, by the type system. The
compiler can also avoid putting some runtime controls where it infers that some
values stay within certain ranges. The code inside contracts (I mean of the
contract programming) can be also used by the compiler to infer where it can
remove more of those runtime controls.
- A short handy syntax is important enough, because for such ranges to become
part of the D culture they may have to be handy, short, etc. If D has some
features that most D programmers don't use, then they become less useful.
- Probably to avoid integral-related bugs the compiler and the runtime have to
control all integral values used by the program, because letting the programmer
use few of them in special points is probably a way to not see them used much.
For the same purpose such controls probably need to be on (activated) by
default, like array range controls.
- Recently I have shown a possible syntax to disable/enable some controls
locally, with a syntax like:
safe(stack, ranges, arrays) {...}
unsafe(ranges) {...}
I think such syntax is better than the syntax used by ObjectPascal for such
purposes.
- When and where they are disabled, such controls must cost zero at runtime,
because D is designed to be quick. I presume the SafeD language has to keep
them always activated (that's why having the compiler remove some of them
automatically or using the contracts is useful).
- From the links I have shown here recently you can learn how common
this class of integral-related bugs is. And generally you can't talk about a
"Safe-D" if you can't sum two integral values reliably :-)
- Range types of integers/chars/enums are useful, but subranges are useful too;
they are essentially subtypes specialized for just this purpose. So
if:
typedef int:1..6 TyDice;
typedef TyDice:1..3 TyHalfDice;
then a function that takes a TyDice automatically accepts a TyHalfDice too.
Note that Haskell type system is so powerful that it allows the programmer to
define such semantics, subtypes, etc. But D type system is quite a bit more
"primitive", so some of such things may need to be hard-wired instead of being
user-defined (by programmers that know a lot of type theory, of course).
- So I think that while unittests and ddoc may be better out of the compiler,
range types may be better into it (note that I use unittests and ddoc _all the
time_, all my programs use them heavily, I like them. I am not saying this
because I don't like unittests and documentation strings).
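
The subrange rule in the list above (a function taking a TyDice also accepts a TyHalfDice) can be approximated in library code with a template constraint. This is only a sketch under assumptions: Ranged and score are hypothetical names, the `int:1..6` typedef syntax is replaced by a struct template, and the store-time range checks are left out to keep the example short.

```d
// Hypothetical library-level stand-in for the proposed ranged typedefs.
struct Ranged(int lo, int hi) if (lo <= hi)
{
    int value = lo;             // store-time checks omitted in this sketch
}

alias TyDice = Ranged!(1, 6);
alias TyHalfDice = Ranged!(1, 3);

// Accepts any Ranged whose bounds fit inside 1..6; the fit is decided at compile time.
int score(int lo, int hi)(Ranged!(lo, hi) r)
    if (1 <= lo && hi <= 6)
{
    return r.value * 10;
}

unittest
{
    assert(score(TyDice(6)) == 60);
    assert(score(TyHalfDice(2)) == 20);     // the subrange is accepted

    // A wider range is rejected before the program ever runs.
    static assert(!__traits(compiles, score(Ranged!(0, 9)(5))));
}
```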

Bye,
bearophile

Nov 05 2008

Hxal <hxal freenode.irc> writes:

Walter Bright wrote:
 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can alter
 the behavior of the wrapped type.
 
 For example, you should also be able to create a ranged int that can
 only contain values from n to m:
 
 RangedInt!(N, M) i;
 
 Preserving this property of structs has driven many design choices in D,
 particularly with regards to how const fits into the type system.

Does that mean we're getting implicit cast overloads?
Because without RangedInt!(N, M).opImplicitCastFrom(int i)
you can't pass int values to functions accepting RangedInt
instances. You can't pass a different RangedInt!(X, Y) either.
It defeats the purpose of implicit range checking if you
have to write litanies like foo(RangedInt!(1,10).check(i))
just to call a function.
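
The asymmetry described here can be sketched as follows. The example assumes a D compiler that supports `alias this` (not the opImplicitCastFrom overload named above); Dice, twice and roll are hypothetical names.

```d
// Hypothetical ranged wrapper with a checked constructor.
struct RangedInt(int lo, int hi) if (lo <= hi)
{
    private int value = lo;

    this(int v)
    {
        assert(lo <= v && v <= hi, "value out of range");
        value = v;
    }

    int get() const { return value; }
    alias get this;             // gives an implicit conversion *to* int only
}

alias Dice = RangedInt!(1, 6);

int twice(int n) { return 2 * n; }
int roll(Dice d) { return d.get; }

unittest
{
    auto d = Dice(3);
    assert(twice(d) == 6);      // Dice converts to int implicitly

    // An int does not convert to Dice implicitly...
    static assert(!__traits(compiles, roll(3)));
    // ...so the caller has to spell out the checked conversion.
    assert(roll(Dice(3)) == 3);
}
```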

Sorry to jump the topic like that, but last time I asked
my thread got hijacked. :P

Nov 06 2008

Walter Bright <newshound1 digitalmars.com> writes:

Hxal wrote:
 Does that mean we're getting implicit cast overloads?
 Because without RangedInt!(N, M).opImplicitCastFrom(int i)

opImplicitCast, yes.

Nov 06 2008

BCS <ao pathlink.com> writes:

Reply to Walter,

 Steven Schveighoffer wrote:
 
 Couldn't one design a struct wrapper that implements this behavior?
 

 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can
 alter the behavior of the wrapped type.
 
 For example, you should also be able to create a ranged int that can
 only contain values from n to m:
 
 RangedInt!(N, M) i;
 
 Preserving this property of structs has driven many design choices in
 D, particularly with regards to how const fits into the type system.
 

Why not explicitly support this with bodied typedefs? 



typedef int MyInt(int m, int M)
{
    MyInt opAssign(int i) { assert(m <= i && i <= M); this = i; }

    MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
    {
       static if(m > m_ && M_ > M) assert(m <= i && i <= M);  // only assert as needed
       else static if(m > m_) assert(m <= i);
       else static if(M < M_) assert(i <= M);
       this = i;
    }

    // this is only to define the return type, the normal code for int is still generated
    MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this; 
}

Nov 06 2008

KennyTM~ <kennytm gmail.com> writes:

BCS wrote:
 Reply to Walter,
 
 Steven Schveighoffer wrote:

 Couldn't one design a struct wrapper that implements this behavior?

 If that cannot be done in D, then D needs some design improvements.
 Essentially, any type should be "wrappable" in a struct which can
 alter the behavior of the wrapped type.

 For example, you should also be able to create a ranged int that can
 only contain values from n to m:

 RangedInt!(N, M) i;

 Preserving this property of structs has driven many design choices in
 D, particularly with regards to how const fits into the type system.

 
 Why not explicitly support this with bodied typedefs?
 
 
 typedef int MyInt(int m, int M)
 {
    MyInt opAssign(int i) { assert(m <= i && i <= M); this = i; }
 
    MyInt opAssign(int m_, int M_)(MyInt!(m_,M_) i) // <- that doesn't work but...
    {
       static if(m > m_ && M_ > M) assert(m <= i && i <= M);  // only assert as needed
       else static if(m > m_) assert(m <= i);
       else static if(M < M_) assert(i <= M);
       this = i;
    }
 
    // this is only to define the return type, the normal code for int is still generated
    MyInt!(m+m_, M+M_) opAdd(int m_, int M_)(MyInt!(m_,M_) i) = this; }
 
 

Just create a struct with an int as its only member?

Nov 07 2008

BCS <ao pathlink.com> writes:

Reply to KennyTM~,

 Just create a struct with an int as its only member?
 

But if you do that then you have to explicitly build all of the math overloads. 
Sometimes, they ALL end up as simple shells so why force the programmer to 
build them all? Also it forces the compiler to use the int code generator 
rather than potentially not inlining.

Nov 07 2008

cemiller <chris dprogramming.com> writes:

On Tue, 04 Nov 2008 12:32:59 -0800, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer

 the biggest sources of runtime errors.

 Yes, but those are neither type safe errors nor memory safe errors. A  
 null pointer is neither mistyped nor can it cause memory corruption.

null pointers DO cause memory corruption:

    byte* foo = null;   // NULL!
    foo[1244916] = 5;   // WORKS; CORRUPTS!

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

cemiller wrote:
 null pointers DO cause memory corruption:
 
    byte* foo = null;   // NULL!
    foo[1244916] = 5;   // WORKS; CORRUPTS!

Yes, but so will any pointer that you index out of bounds. That's why 
safe D will not allow arithmetic on pointers.
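
What such a restriction looks like can be sketched as below. This assumes a later D compiler in which the safe subset is spelled with the `@safe` attribute, a name that does not appear in this thread; third is a hypothetical helper.

```d
// Indexing a raw pointer is rejected inside @safe code...
static assert(!__traits(compiles, (byte* foo) @safe { foo[1244916] = 5; }));

// ...while the same statement is accepted in @system code.
static assert( __traits(compiles, (byte* foo) @system { foo[1244916] = 5; }));

// Slices carry their length, so @safe code may index them, with a bounds check.
@safe int third(const int[] a)
{
    return a[2];
}

unittest
{
    assert(third([10, 20, 30]) == 30);
}
```

Nothing in the two static asserts is executed; they only ask the compiler whether the code would be accepted.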

Nov 04 2008

Brendan Miller <catphive catphive.net> writes:

Walter Bright Wrote:

 Brendan Miller wrote:
 This is obviously a problem. Everyone knows that null pointer

 the biggest sources of runtime errors.

 
 Yes, but those are neither type safe errors nor memory safe errors. A 
 null pointer is neither mistyped nor can it cause memory corruption.

Well.. I can't speak for null pointers in D, but they can definitely cause
memory corruption in C++. Not all OS's have memory protection. *remembers the
good old days of Mac OS system 7*

Back to the important point!

A couple of times in this thread I've seen people suggest that null pointers
are type safe. I don't see how that statement is justifiable. People accept
null because it's always been there for those of us who are long time C coders.
What you have to remember, is C was not type safe in any way shape or form.

First off, let's clarify that we're talking about *static* type safety.
Languages like Python are dynamically type safe because at runtime you will see
an exception thrown if you try to perform an operation on a type that does
not support it. If you have a reference in Python, you can point it to whatever
the hell you want and the runtime will prevent you from performing the wrong
operation on the wrong data. It's a more limited form of type checking than
static type checking, but many people find this acceptable.

In a statically typed language, it is *impossible* to perform an operation on a
type that it does not support because at compile time you know the types of the
objects.

Concretely null is a pointer to address zero. For some type T, there is never
any T at address zero. Therefore a statically typed language will prevent you
from assigning a pointer to an object that is not of type T to a pointer
declared to be type T. That's *the entire point* of static typing. T* means
"that which I point to is in the set of T". T sans the star means "I am in the
set of T". Not sometimes. Not maybe. Always.

Yes, you can also get performance benefits from type annotations... but that
doesn't make the language statically type *safe*.

Now of course, sometimes we do want a pointer to type T to be null... but
what does that *mean*? It means, you have a variable that sometimes you want to
hold a pointer to T... and sometimes you don't want to hold a pointer to T.

This is called a variant. Different languages implement variants in different
ways and have different names for them. In C, they are called unions. C, again,
is not type *safe* so if you try to treat a union as the wrong type, it will
let you. However, in most languages, variants are dynamically typed, and thus
offer the lesser form of type safety.

C and C++ pointers to T are variants of type T and the type of NULL. Except, of
course, like unions they aren't type safe even dynamically because the runtime
won't stop you from dereferencing null. The operating system *will* stop you by
killing your process, if you are on a system with protected memory because
address zero is not accessible to userspace on most systems. *most* systems,
not all.

Think about this in terms of set theory and the idea should become clear. Null
should not be assignable to a pointer to T because the object it points to at
address zero does not lie within the set of T's. If it did lie within the set
of T's, then this should be valid:

T myObject;
myObject = *NULL;

It shouldn't even require a type cast because type casts are ways of breaking
out of static typing. But it does in C++. In fact, this code generates:

error: invalid type argument of `unary *'

Damn right.

Now, really, what's so hard about adding a statically type safe pointer? C++
already did it, and they are called references. My complaint here, after all,
was that D is apparently less type safe than C++.

Now, I have other problems with C++ references. That they have value semantics
is just stupid (especially since they are *called* references!). Type safety
and value  vs reference semantics have nothing to do with one another. Indeed,

added nullable value types.

Brendan

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Brendan Miller wrote:
 Well.. I can't speak for null pointers in D, but they can definitely
 cause memory corruption in C++. Not all OS's have memory protection.
 *remembers the good old days of Mac OS system 7*

Those machines are obsolete, for excellent reasons <g>. If, for some 
reason, a D implementation needs to be implemented for such a machine, 
the solution is to optionally insert a runtime check analogously to 
array bounds checking.

 Concretely null is a pointer to address zero. For some type T, there
 is never any T at address zero. Therefore a statically typed language
 will prevent you from assigning a pointer to an object that is not of
 type T to a pointer declared to be type T. That's *the entire point*
 of static typing. T* means "that which I point to is in the set of
 T". T sans the star means "I am in the set of T". Not sometimes. Not
 maybe. Always.

I understand your point, and it sounds right technically. But 
practically, I'm not convinced.

For example, consider a linked list. How do you know you've reached the 
end of the list? By the pointer being null or pointing to some 
"impossible" object. If you pick the latter, what really have you gained 
over a null pointer?

Nov 04 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Tue, Nov 4, 2008 at 11:40 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 I understand your point, and it sounds right technically. But practically,
 I'm not convinced.

 For example, consider a linked list. How do you know you've reached the end
 of the list? By the pointer being null or pointing to some "impossible"
 object. If you pick the latter, what really have you gained over a null
 pointer?

The implication of non-nullable types isn't that nullable types
disappear; quite the opposite, in fact.  Nullable types have obvious
use for exactly the reason you explain.  The problem arises when
nullable types are used in situations where it makes _no sense_ for
null to appear.  This is where bugs show up.  In a system that has
both nullable and non-null types, nullable types act as a sort of
container, preventing you from accessing anything through them as it
cannot be statically proven that the access will be legal at runtime.
In order to access something from a nullable type, you have to convert
it to a non-null type.  Delight uses D's "declare a variable in the
condition of an if or while" to great effect here:

if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
{
    // f is known not to be null.
}
else
{
    // something else happened.  Handle it.
}

Null still has a purpose.  It's just that its purpose is really only
to signal a special case.

Nov 04 2008

Walter Bright <newshound1 digitalmars.com> writes:

Jarrett Billingsley wrote:
 The implication of non-nullable types isn't that nullable types
 disappear; quite the opposite, in fact.  Nullable types have obvious
 use for exactly the reason you explain.  The problem arises when
 nullable types are used in situations where it makes _no sense_ for
 null to appear.  This is where bugs show up.  In a system that has
 both nullable and non-null types, nullable types act as a sort of
 container, preventing you from accessing anything through them as it
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:
 
 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
     // f is known not to be null.
 }
 else
 {
     // something else happened.  Handle it.
 }

I don't see what you've gained here. The compiler certainly can do flow 
analysis in some cases to know that a pointer isn't null, but that isn't 
generalizable. If a function takes a pointer parameter, no flow analysis 
will tell you if it is null or not.

Nov 05 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-11-05 03:18:50 -0500, Walter Bright <newshound1 digitalmars.com> said:

 I don't see what you've gained here. The compiler certainly can do flow 
 analysis in some cases to know that a pointer isn't null, but that 
 isn't generalizable. If a function takes a pointer parameter, no flow 
 analysis will tell you if it is null or not.

I'm not sure how you're reading things, but to me having two kinds of 
pointers (nullable and non-nullable) is exactly what you need to enable 
nullness flow analysis across function boundaries.

Basically, if you declare some pointer to be restricted to not-null in 
a function signature, and then try to call the function by passing it a 
possibly null pointer, the compiler can tell you that you need to check 
for null at the call site before calling the function.

It then follows that when given a non-nullable pointer you can call a 
function requiring a non-nullable pointer directly without any check 
for null, because you know the pointer you received can't be null.

Currently, you can achieve this with proper documentation of functions 
saying whether arguments accept null and if return values can return 
null, and write your code with those assumptions in mind. More often 
than not however there is no such documentation and you find yourself 
checking for null a lot more than necessary. If this property about 
pointers in function parameters and return values were known to the 
compiler, the compiler could check for you that you're doing things 
correctly, warn you whenever you're forgetting a null check, and 
optimise away checks for null on these pointers.

I know the null-dereferencing problem can generally be caught easily at 
runtime, but sometimes your null comes from far away in the program 
(someone set a global to null for instance) and you're left to wonder 
who put a null value there in the first place. Non-nullable pointers 
would help a lot in those cases because you no longer have to test 
every code path and the error of giving a null value would be caught at 
the source (with the compiler telling you to check against null), not 
only where it's being dereferenced.

 - - -

That said, I think this could be done using a template. Consider this:

	struct NotNullPtr(Type) {
		private Type* ptr;
		this(Type* ptr) {
			opAssign(ptr);
		}
		void opAssign(Type* ptr) {
			// if this gets inlined and you have already checked for null, then
			// hopefully the optimizer will remove this redundant check.
			if (ptr)
				this.ptr = ptr;
			else
				throw new Exception("Unacceptable null value.");
		}
		void opAssign(NotNullPtr other) {
			this.ptr = other.ptr;
		}
		Type* opCast() {
			return ptr;
		}
		ref Type opDeref() {
			// dereference the stored pointer; `&ptr` would yield a Type**
			return *ptr;
		}
		alias opDeref opStar;
		// ... implement the rest yourself
	}

(not tested)

You could use this template everywhere you want to be sure a pointer 
isn't null. It guarantees that its value will never be null, and will 
throw an exception at the source where you attempt to put a null value 
in it, not when you attempt to dereference it later, when it's too late 
and your program has already been put in an incorrect state.

	NotNullPtr!(int) globalThatShouldNotBeNull;

	int* foo();
	globalThatShouldNotBeNull = foo(); // will throw if you attempt to set it to null.

	void bar(NotNullPtr!(int) arg);
	bar(globalThatShouldNotBeNull); // no need to check for null.

The greatest downside to this template is that since it isn't part of 
the language, almost no one will use it in their function prototypes 
and return types. That's not counting that its syntax is verbose and 
not very appealing (although it's not much worse than boost::shared_ptr 
or std::auto_ptr).

But still, if you have a global or member variable that must not be 
null, it can be of use; and if you have a function where you want to 
put the burden of checking for null on the caller, it can be of use.
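
A compilable variant of the idea, as a minimal sketch under stated assumptions: it targets a current D2 compiler, the name `NotNull` is hypothetical, `opUnary` stands in for the `opDeref`/`opStar` names used above, and `@disable this()` is added so that a default (null) instance cannot be created. It has not been tuned and is not a library type.

```d
import std.exception : assertThrown, enforce;

// Hypothetical wrapper: the stored pointer is checked for null on the way in.
struct NotNull(T)
{
    private T* ptr;

    @disable this();          // forbid default construction, which would hold null

    this(T* p)
    {
        enforce(p !is null, "Unacceptable null value.");
        ptr = p;
    }

    void opAssign(T* p)
    {
        enforce(p !is null, "Unacceptable null value.");
        ptr = p;
    }

    // Unary `*` yields a reference to the pointee.
    ref T opUnary(string op)() if (op == "*")
    {
        return *ptr;
    }

    T* get() { return ptr; }
}

// The callee needs no null check: the type carries the guarantee.
void bar(NotNull!int arg)
{
    *arg += 1;
}

void main()
{
    int value = 41;
    auto p = NotNull!int(&value);
    bar(p);
    assert(value == 42);

    int* q = null;
    assertThrown(NotNull!int(q)); // rejected where the null is introduced
}
```

The failure is still a runtime exception rather than a compile-time error, so this only moves the check to the point of assignment.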

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Nov 05 2008

bearophile <bearophileHUGS lycos.com> writes:

Michel Fortin:
 Basically, if you declare some pointer to be restricted to not-null in 
 a function signature, and then try to call the function by passing it a 
 possibly null pointer, the compiler can tell you that you need to check 
 for null at the call site before calling the function.
 
 It then follows that when given a non-nullable pointer you can call a 
 function requiring a non-nullable pointer directly without any check 
 for null, because you know the pointer you received can't be null.
 
 Currently, you can achieve this with proper documentation of functions 
 saying whether arguments accept null and if return values can return 
 null, and write your code with those assumptions in mind. More often 
 than not however there is no such documentation and you find yourself 
 checking for null a lot more than necessary. If this property about 
 pointers in function parameters and return values were known to the 
 compiler, the compiler could check for you that you're doing things 
 correctly, warn you whenever you're forgetting a null check, and 
 optimise away checks for null on these pointers.

The same is true of turning integral values into range values. If I want to
write a function that takes an iterable of dice-throw results, I can use an
enum, or check that every item of the iterable is in the range 1 - 6. If range
values are available I can just write:

StatsResults stats(Dice[] throwing_results) { ...

Where Dice is:
typedef int:1..7 Dice;

I then don't need to remember to check that the items are in 1-6 inside stats(),
and the check is pushed up, toward the place where that throwing_results was
created (or where it comes from disk, user input, etc). This avoids some bugs
and reduces some code.

Bye,
bearophile

Nov 05 2008

Michel Fortin <michel.fortin michelf.com> writes:

On 2008-11-05 08:16:59 -0500, bearophile <bearophileHUGS lycos.com> said:

 The same is true making integral values become range values. If I want 
 to write a function that takes an iterable of results of throwing a 
 dice, I can use an enum, or control every item of the iterable for 
 being in range 1 - 6. If range values are available I can just:
 
 StatsResults stats(Dice[] throwing_results) { ...
 
 Where Dice is:
 typedef int:1..7 Dice;
 
 I then don't need to remember to control items for being in 1-6 inside 
 stats(), and the control is pushed up, toward the place where that 
 throwing_results was created (or where it comes from disk, user input, 
 etc). This avoids some bugs and reduces some code.

It's exactly the same thing, except that for numbers you may want much 
more than simple ranges. You could want non-zero numbers, odd or even 
numbers, square numbers, etc. I have the feeling that whatever the 
language tries to restrict about numbers, it will never be enough. So my 
feeling is that this is better left to a template.
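
As a rough sketch of what such a template could look like, assuming a current D2 compiler: `Ranged`, `Dice`, and `total` are hypothetical names, and the half-open range `[lo, hi)` mirrors the proposed `int:1..7` notation. The check happens where the value is created, so consumers can rely on it.

```d
import std.exception : assertThrown, enforce;

// Hypothetical integer restricted to the half-open range [lo, hi).
struct Ranged(int lo, int hi)
{
    static assert(lo < hi, "empty range");

    private int value = lo;   // the default value lies inside the range

    this(int v)
    {
        enforce(lo <= v && v < hi, "value out of range");
        value = v;
    }

    int get() const { return value; }
}

alias Dice = Ranged!(1, 7);   // 1 through 6

// No per-item range check is needed here.
int total(const Dice[] throws)
{
    int sum = 0;
    foreach (d; throws)
        sum += d.get;
    return sum;
}

void main()
{
    auto throws = [Dice(2), Dice(6), Dice(1)];
    assert(total(throws) == 9);
    assertThrown(Dice(7));    // rejected at construction
}
```

Other predicates (non-zero, odd, square) would need a different template parameter, for instance a predicate function instead of two bounds.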

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Nov 06 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Jarrett Billingsley wrote:
 The implication of non-nullable types isn't that nullable types
 disappear; quite the opposite, in fact.  Nullable types have obvious
 use for exactly the reason you explain.  The problem arises when
 nullable types are used in situations where it makes _no sense_ for
 null to appear.  This is where bugs show up.  In a system that has
 both nullable and non-null types, nullable types act as a sort of
 container, preventing you from accessing anything through them as it
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:

 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
    // f is known not to be null.
 }
 else
 {
    // something else happened.  Handle it.
 }

 I don't see what you've gained here. The compiler certainly can do flow
 analysis in some cases to know that a pointer isn't null, but that isn't
 generalizable. If a function takes a pointer parameter, no flow analysis
 will tell you if it is null or not.

What?  Is your response in response to my post at all?  I am not
talking about flow analysis on "normal" pointer types.  I am talking
about the typing system actually being modified to allow a programmer
to express the idea, with a _type_, and not with static checking, that
a reference/pointer value _may not be null_.

In a type system with non-null types, if a function takes a non-null
parameter and you pass it a nullable pointer, _you get an error at
compile time_.

// foo takes a non-null int*.
void foo(int* x) { writefln("%s", *x); }

// bar returns a nullable int* - an int*?.
int*? bar(int x) { if(x < 10) return new int(x); else return null; }

foo(bar(3)); // compiler error, you can't pass a potentially null type
             // into a parameter that can't be null, moron

if(auto p = bar(3))
    foo(p); // ok
else
    throw new Exception("Wah wah wah bar returned null");

With nullable types, flow analysis doesn't have to be done.  It is
implicit in the types.  It is mangled into function names.  foo
_cannot_ take a pointer that may be null.  End of story.

Nov 05 2008

Walter Bright <newshound1 digitalmars.com> writes:

Jarrett Billingsley wrote:
 With nullable types, flow analysis doesn't have to be done.  It is
 implicit in the types.  It is mangled into function names.  foo
 _cannot_ take a pointer that may be null.  End of story.

Sure, which is why I was puzzled at the example given, which is about 
something else entirely.

What you're talking about is a type constructor to create another kind 
of pointer. It's a significant increase in complexity. That's why I was 
talking about complexity being a downside of this - there is a tradeoff.

Nov 05 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Wed, Nov 5, 2008 at 10:43 PM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
 On Wed, Nov 5, 2008 at 3:18 AM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 Jarrett Billingsley wrote:
 cannot be statically proven that the access will be legal at runtime.
 In order to access something from a nullable type, you have to convert
 it to a non-null type.  Delight uses D's "declare a variable in the
 condition of an if or while" to great effect here:

 if(auto f = someFuncThatReturnsNullableFoo()) // f is declared as non-null
 {
    // f is known not to be null.
 }
 else
 {
    // something else happened.  Handle it.
 }

 I don't see what you've gained here. The compiler certainly can do flow
 analysis in some cases to know that a pointer isn't null, but that isn't
 generalizable. If a function takes a pointer parameter, no flow analysis
 will tell you if it is null or not.

 What?  Is your response in response to my post at all?  I am not
 talking about flow analysis on "normal" pointer types.  I am talking
 about the typing system actually being modified to allow a programmer
 to express the idea, with a _type_, and not with static checking, that
 a reference/pointer value _may not be null_.

 In a type system with non-null types, if a function takes a non-null
 parameter and you pass it a nullable pointer, _you get an error at
 compile time_.

 // foo takes a non-null int*.
 void foo(int* x) { writefln("%s", *x); }

 // bar returns a nullable int* - an int*?.
 int*? bar(int x) { if(x < 10) return new int(x); else return null; }

 foo(bar(3)); // compiler error, you can't pass a potentially null type
              // into a parameter that can't be null, moron

 if(auto p = bar(3))
    foo(p); // ok
 else
    throw new Exception("Wah wah wah bar returned null");

 With nullable types, flow analysis doesn't have to be done.  It is
 implicit in the types.  It is mangled into function names.  foo
 _cannot_ take a pointer that may be null.  End of story.

I didn't really get what you meant the first time either.  The thing
about Delight's use of auto "to great effect" wasn't clear.   I
assumed it was basically the same as D's auto inside an if, but I see
now that it's not.  Looks like a run-time type deduction, even though
it's not really.  Kinda neat.

--bb

Nov 05 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Wed, Nov 5, 2008 at 9:33 AM, Bill Baxter <wbaxter gmail.com> wrote:
 I didn't really get what you meant the first time either.  The thing
 about Delight's use of auto "to great effect" wasn't clear.   I
 assumed it was basically the same as D's auto inside an if, but I see
 now that it's not.  Looks like a run-time type deduction, even though
 it's not really.  Kinda neat.

It's almost the same as D's variable-inside-an-if, with the addition
that you can use it to convert a nullable type to a non-null type.
Hence, in:

if(auto f = someFunctionThatCanReturnNull())

if someFunctionThatCanReturnNull returns an int*? (nullable pointer to
int), typeof(f) will just be int* (non-null pointer to int), since in
the scope of the if statement, f is provably non-null.
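
For comparison, a small sketch of the same idiom in stock D, assuming a current D2 compiler; `bar` and `foo` are stand-ins modelled on the earlier example. The variable is only in scope when the value is non-null, but its type stays an ordinary `int*`, so the narrowing to a non-null type described above is the part Delight adds.

```d
// Returns null for x >= 10, otherwise a pointer to a fresh copy of x.
int* bar(int x)
{
    if (x >= 10)
        return null;
    auto p = new int;
    *p = x;
    return p;
}

// In stock D this parameter can still be null; the assert is a runtime check.
void foo(int* x)
{
    assert(x !is null);
}

void main()
{
    if (auto p = bar(3))
    {
        // Still a plain pointer type: nothing in the type records "non-null".
        static assert(is(typeof(p) == int*));
        foo(p);
        assert(*p == 3);
    }
    else
        assert(false, "bar(3) unexpectedly returned null");

    if (auto p = bar(30))
        assert(false, "bar(30) should have returned null");
}
```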

Nov 05 2008

Brendan Miller <catphive catphive.net> writes:

Walter Bright Wrote:

 Brendan Miller wrote:
 Well.. I can't speak for null pointers in D, but they can definitely
 cause memory corruption in C++. Not all OS's have memory protection.
 *remembers the good old days of Mac OS system 7*

 
 Those machines are obsolete, for excellent reasons <g>. If, for some 
 reason, a D implementation needs to be implemented for such a machine, 
 the solution is to optionally insert a runtime check analogously to 
 array bounds checking.

You mean D isn't meant to run on embedded hardware? I thought it was a systems
programming language? A lot of hardware today has no MMU, which as I understand
it means you can't have memory protection.

If you are only targeting x86 after the ability to have memory protection was
added to the hardware... then you can make all kinds of assumptions I guess.

 
 Concretely null is a pointer to address zero. For some type T, there
 is never any T at address zero. Therefore a statically typed language
 will prevent you from assigning a pointer to an object that is not of
 type T to a pointer declared to be type T. That's *the entire point*
 of static typing. T* means "that which I point to is in the set of
 T". T sans the star means "I am in the set of T". Not sometimes. Not
 maybe. Always.

 
 I understand your point, and it sounds right technically. But 
 practically, I'm not convinced.
 
 For example, consider a linked list. How do you know you've reached the 
 end of the list? By the pointer being null or pointing to some 
 "impossible" object. If you pick the latter, what really have you gained 
 over a null pointer?

The short answer is you use a variant. It handles the case where you would
use null slightly better (because it is dynamically type safe). For a C style
language, just having two kinds of pointers might be more natural. Like maybe:

safe T* object1; // does not permit null.
unsafe T* object2; // does permit null.

and then have a cast between them:
object1 = (safe T*)object2; // This throws some kind of well defined exception
                            // if object2 is null.

The long answer is that the best way to learn about type safety is to check out
SML or OCaml. These are a couple of the few truly statically type safe
languages. Languages like those introduced type safety, and ideas like generics
and type inference.

ML is to type safety as Smalltalk is to object orientation. You will probably
never write a real world program in ML, but learning it is probably the best
way to get a good understanding of where things like type safety and templates
came from in the first place.

Also the linked list example you give is actually *way easier* in a language
like ML that supports variants. The reason for this is that variants can be
used in conjunction with a pattern matching syntax, which is kind of like a
switch statement on steroids.

As a side note, I find it interesting that C++ templates are actually much more
powerful than the ML style generics they emulate. Specifically, templates can
do template metaprogramming, whereas I'm pretty sure this is not possible in
any ML style languages.

Nov 05 2008

Mike Hearn <mike plan99.net> writes:

 The basic problem is that it's hard to integrate a language with non-
 nullable types with libraries written without them (e.g. when auto-
 generating bindings with BCD). This would likely be a big problem for D, 
 since integrating with existing C and C++ code is a major feature.

Can't that be solved by reversing the syntax, i.e., you mark variables that
cannot be null rather than variables that can be? The compiler then requires
you to prove the non-nullness of the value on that codepath (or cast it away
with nonnull).

I'd love to see non-null types in D2. Intuitively, it'd catch at compile time
quite a few bugs I see in my programs (assuming a strong compiler analysis).

Nov 28 2008
