digitalmars.D - Some memory safety

bearophile <bearophileHUGS lycos.com> writes:
My theory is that catching some bugs early is better than catching fewer bugs
early, or none.

The following examples are D1, but I think the situation for D2 is the same.

-----------------

#1) Today this bug is found at compile-time, good:

void main() {
    int[5] a;
    a[6] = 1;
}

-----------------

#2) Today, if not compiled with -release, and if n=10 and m=15, the following
program raises an ArrayBoundsError at run time:

import std.conv: toInt;
void main(string[] args) {
    int n = args.length >= 2 ? toInt(args[1]) : 10;
    int m = args.length >= 3 ? toInt(args[2]) : 5;
    auto a = new int[n];
    a[m] = 10;
}

-----------------

#3) The following bug too is found at compile-time, but the error message shows
that the index is silently converted into an unsigned integer, so this is less
good:

void main() {
    int[5] a;
    a[-2] = 1;
}

temp.d(3): Error: array index 4294967294 is out of bounds a[0 .. 5]
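
The number in that message is just -2 reinterpreted as an unsigned value. A
minimal sketch of the conversion, assuming a 32-bit index type as in the
message above (on a 64-bit target the index type is wider and the printed
value would differ):

```d
import std.stdio : writeln;

void main() {
    int index = -2;
    // Reinterpreting the bits of -2 as a 32-bit unsigned value gives
    // 2^32 - 2, which is the number shown in the error message.
    uint asUnsigned = cast(uint) index;
    assert(asUnsigned == 4294967294u);
    writeln(asUnsigned);
}
```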

-----------------

#4) The compiler doesn't find the following bugs at compile-time. Conceptually
it's the same as #1, so I'd like the compiler to refuse this at compile time in
many situations:

import std.c.stdlib: malloc;
void main() {
    int* a = cast(int*)malloc(int.sizeof * 5);
    a[10] = 1;
}


import std.c.stdlib: malloc;
void main() {
    byte* b = cast(byte*)malloc(int.sizeof * 5);
    b[25] = 1;
}

Newly written D code that uses malloc is less common, but you can often find
malloc in C code ported to D.

-----------------

#5) The compiler doesn't find this out-of-bounds access at run time if n=10 and
m=15, but it's not much different from #2. When not in release mode, the
compiler can keep a run-time variable that stores the length of the memory
zone, and produce something like a MemoryBoundsError on an access outside it:

import std.c.stdlib: malloc;
import std.conv: toInt;
void main(string[] args) {
    int n = args.length >= 2 ? toInt(args[1]) : 10;
    int m = args.length >= 3 ? toInt(args[2]) : 5;
    int* a = cast(int*)malloc(int.sizeof * n);
    a[m] = 10;
}

(Well, in this situation the compiler may even note that such a hidden length
variable is equal to n, but I generally don't need a compiler that smart).
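
Something close to that hidden length can already be had by hand: slicing the
malloc'ed pointer yields a D array (pointer plus length), and indexing it is
bounds-checked when array bounds checks are enabled. This is only a sketch,
not a compiler feature; the helper name mallocInts is made up for
illustration, and it uses the current D2 module name core.stdc.stdlib rather
than std.c.stdlib:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: wrap a malloc'ed block in a D slice so that
// indexing goes through the normal array bounds check.
int[] mallocInts(size_t n) {
    auto p = cast(int*) malloc(int.sizeof * n);
    if (p is null)
        return null;      // allocation failed
    return p[0 .. n];     // slice = pointer + length
}

void main() {
    int[] a = mallocInts(10);
    scope (exit) free(a.ptr);   // the memory is still owned manually
    assert(a.length == 10);
    a[5] = 10;                  // inside the block: fine
    // a[15] = 10;              // would fail the bounds check instead of
    //                          // silently writing past the block
}
```

The slice does not own the memory, so free(a.ptr) is still needed, and the
check disappears in builds where bounds checking is turned off.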

-----------------

#5b) This bug is a bit less easy to find at compile time:

import std.c.stdlib: malloc;
import std.conv: toInt;
void foo(short* a, int m) {
    a[m] = 10;
}
void main(string[] args) {
    int n = args.length >= 2 ? toInt(args[1]) : 10;
    int m = args.length >= 3 ? toInt(args[2]) : 5;
    short* a = cast(short*)malloc(short.sizeof * n);
    foo(a, m);
}

It may be found (when not in -release mode) by translating that code to
something like:

import std.c.stdlib: malloc, exit;
import std.c.stdio: fprintf, stderr;
import std.conv: toInt;
void foo(short* a, int m, size_t __alength) {
    if ((m * (*a).sizeof) >= __alength) {
        fprintf(stderr, "MemoryBoundsError(%d)\n", __LINE__  + 3);
        exit(1);
    }
    a[m] = 10;
}
void main(string[] args) {
    int n = args.length >= 2 ? toInt(args[1]) : 10;
    int m = args.length >= 3 ? toInt(args[2]) : 5;
    size_t __alength = short.sizeof * n;
    short* a = cast(short*)malloc(__alength);
    foo(a, m, __alength);
}
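
The same idea can be sketched in user code by bundling the pointer with its
length, so the length travels through function calls without an extra
parameter. Everything here is an illustration under assumptions: BoundedPtr
and its fields are hypothetical names, the length is counted in elements
rather than bytes, the code uses current D2 module names, and it throws an
exception where the translation above prints a message and exits:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical "fat pointer": a raw pointer plus its element count.
struct BoundedPtr(T) {
    T* ptr;
    size_t count;

    // Checked element access. A negative int index converts to a huge
    // size_t value, so it fails the same test.
    ref T opIndex(size_t i) {
        if (i >= count)
            throw new Exception("MemoryBoundsError");
        return ptr[i];
    }
}

void foo(BoundedPtr!short a, int m) {
    a[m] = 10;   // goes through the checked opIndex
}

void main() {
    enum n = 10;
    auto a = BoundedPtr!short(cast(short*) malloc(short.sizeof * n), n);
    assert(a.ptr !is null);
    scope (exit) free(a.ptr);

    foo(a, 5);               // in bounds
    assert(a[5] == 10);

    bool caught = false;
    try {
        foo(a, 15);          // out of bounds: rejected before the write
    } catch (Exception e) {
        caught = true;
    }
    assert(caught);
}
```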

-----------------

All I have shown here is very primitive. The Cyclone designers have spent 10000
times more thinking time than I have on such topics, so Cyclone is able to spot
similar bugs and more.
C# uses other useful tricks that I am currently looking at.

Bye,
bearophile
May 18 2009
Walter Bright <newshound1 digitalmars.com> writes:
D is not going to catch memory safety problems that result from using C 
library functions, like malloc. D can only guarantee memory safety when 
using D code and D library functions.

The programmer is on his own using the unsafe C functions.
May 18 2009
bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:

Sorry for raising this thread.
While C# has purposes somewhat different from D, I think the C# designers are
right in their emphasis on safety. Modern programmers appreciate some safety
features, and modern languages provide them. The ideas I am talking about are
already implemented in C#.
D can disable such safeties in release mode.

For example this C# code, compiled in release + unsafe mode, shows that .NET
stops the execution almost as soon as you write outside the allowed memory
zone. This uses stackalloc (similar to alloca), so they may be using a stack
canary to detect the out-of-bounds condition at run time:
http://en.wikipedia.org/wiki/Stack_buffer_overflow#Stack_canaries

using System;
public sealed unsafe class Test {
  static void Main(string[] args) {
    int n = args.Length > 0 ? Int32.Parse(args[0]) : 10;
    int* a = stackalloc int[n];
    for (int i = 0; i < n * 2; i++) {
      a[i] = i;
      Console.WriteLine("{0}", a[i]);
    }
  }
}


>D is not going to catch memory safety problems that result from using C library
functions, like malloc. D can only guarantee memory safety when using D code
and D library functions. The programmer is on his own using the unsafe C
functions.<

When I port C code to D I'd like the D compiler to help me catch some of the
memory bugs that may be present in the translated C code. In C you have
www.splint.org and valgrind, but the Java compiler shows how good it is to
have a stricter compiler in the first place.

In D code you also have array.ptr and std.gc.malloc (and std.c.stdlib.alloca,
which is a C function but has no equivalent in D, so I can think of it as part
of D), and such things may lead to bugs. They may be totally disallowed in
"safe" D modules, but some safety may be added to unsafe D modules too. For
example, std.gc.capacity() of Phobos1 can be used to detect out-of-bounds
situations with pointers given by std.gc.malloc.

Bye,
bearophile
May 20 2009