digitalmars.D.learn - Code to show

bearophile (71/72) Mar 05 2008 One of the most important topics of D I don't understand yet regards wha...

Marius Muja (78/78) Mar 05 2008 I'm using something very similar in my speed-critical code:

bearophile (7/8) Mar 05 2008 Thank you for showing. Your code is more refined (mine was an experiment...

Bill Baxter (5/19) Mar 05 2008 Sorry I didn't read your post in detail, but if think C-like behavior is...
Marius Muja (3/16) Mar 06 2008 I usually disable the GC in all speed-critical code and I manage memory

bearophile (5/8) Mar 05 2008 You may want to read my post :-)

Bill Baxter (6/13) Mar 05 2008 Ok, I just latched on to your phrase "Programming in C is much simpler"....

bearophile <bearophileHUGS lycos.com> writes:

One of the most important topics of D I don't understand yet regards what the
(Phobos) GC does when I try to allocate memory in more manual ways or in mixed
situations.
In this post I show some simple code, if you have some time I'd like you to
take a look at it and tell me what possibile problems/bugs (mostly related to
GC/memory) it may have (it's not fully tested yet, but it's short enough).
(Some days from now I'll try to post a second piece of code (regarding a Deque
implementation of mine) with a more complex GC/memory situation).


I have read the "Allocating Uninitialized Arrays On The Stack" part in the D
docs:
The uninitialized data that is on the stack will get scanned by the garbage
collector looking for any references to allocated memory. Since the
uninitialized data consists of old D stack frames, it is highly likely that
some of that garbage will look like references into the GC heap, and the GC
memory will not get freed. This problem really does happen, and can be pretty
frustrating to track down.<

That talks about the stack only, but is this problem possible for
Garbage-Collected memory on the heap too?

In a small bioinformatics benchmark program the time taken to allocate a 40 MB
matrix (about 3200^2 of int) was too much long for my tastes, so I have created
the following very experimental code:


import std.c.stdlib: cmalloc = malloc, cfree = free;
import std.gc: gcmalloc = malloc, gcrealloc = realloc, hasNoPointers;

T[] NewVoidGCArray(T, bool pointers=true)(int n) {
    auto pt = cast(T*)gcmalloc(n * T.sizeof);
    static if (!pointers)
        hasNoPointers(pt);
    return pt[0 .. n];
}

void delVoidGCArray(T)(ref T a) {
    gcrealloc(a.ptr, 0);
    a = null;
}

T[][] NewVoidGCMatrix(T, bool pointers=true)(int nr, int nc) {
    auto pt = cast(T*)gcmalloc(nr * nc * T.sizeof);
    static if (!pointers)
        hasNoPointers(pt);
    auto result = new T[][](nr);
    int pos;
    foreach(ref row; result) {
        row = pt[pos .. pos+nc];
        pos += nc;
    }
    return result;
}

void delVoidGCMatrix(T)(ref T[][] a) {
    gcrealloc(a.ptr, 0);
    a = null;
}

//-----------------------------------

T[] NewVoidCArray(T)(int n) {
    auto pt = cast(T*)cmalloc(n * T.sizeof);
    return pt[0 .. n];
}

void delVoidCArray(T)(ref T a) {
    cfree(a.ptr);
    a = null;
}

T[][] NewVoidCMatrix(T)(int nr, int nc) {
    auto pt = cast(T*)cmalloc(nr * nc * T.sizeof);
    auto result = new T[][](nr);
    int pos;
    foreach(ref row; result) {
        row = pt[pos .. pos+nc];
        pos += nc;
    }
    return result;
}

void deleteVoidCMatrix(T)(ref T[][] a) {
    cfree(a.ptr);
    a = null;
}


Notes:
- If you spot some wrong or dangerous things please tell me.
- I know most programs spend most time using data structures, and not just
creating them, so in most programs this can't speed up much the code.
- Even if this code is "safe" (and I am far from sure of that), it's longer
than the normal D code, so I am going to use it only in very special
situations, in speed-critical situations, and probably never in long-ish
programs that must be realiable enough.
- If I use such NewVoid[C|CG][Matrix|Array] I am going to fill them as soon as
possible anyway, because void data is a bomb anyway, and it requires care.
- If that T type is non-reference data (like int, float, etc) then
hasNoPointers() is useful. If T is an object, then the bool "pointers" must be
true (elsewhere I have written a recursive template that automatically tells if
a type is/contains a refence type or not, but I haven't used it here) to allow
the GC manage those object referces correctly, keeping them alive if I remove
their reference elsewhere, etc.
- I presume putting D objects in one of those arrays allocated from the C heap
(NewVoidCArray/ NewVoidCMatrix) is a too much dangerous thing to do, I don't
know if it's possible, but I think I'll always avoid to do it.
- The size of those array/matrices can be defined at run time. And their memory
is contigous, this means a 3200^2 matrix requires about less than about 41 MB,
instead of about 52 (because the sizes from the memory pool become rounded,
etc). Being contigous maybe there is a bit more cache coherence.
- Those arrays/matrices can't change size once created, and those matrices must
be rectangular, but I can often live with this. They can be used with the
normal D syntax, avoiding the slow down D has when you create a struct to wrap
such things, with the operator overloading to use [] etc.
- All those slices, like pt[0 .. n] don't actually copy the memory, so they are
fast. The matrix uses an array of arrays, this wastes some memory, but I can
live with that, and it allows me a standard syntax and to avoid structs/classes
(another possible solution is to use pointers, so I can use the C syntax, but I
lose the handy .length attribute, and the nice array bound cheeks when not in
-release mode).
- With one of this functions, I have written short code that's as fast as C
code, but half the lines of code (and it produces the correct result). Probably
I'll show you the interesting benchmark I was working on.

Once I'll slowly start to understand how the GC works, I'll be able to create
more complex data structures for D, like the Deque I am working on.

Bye and thank you for any help you can give,
bearophile

Mar 05 2008

Marius Muja <mariusm cs.ubc.ca> writes:

I'm using something very similar in my speed-critical code:


/**
  * Allocates (using C's malloc) a generic type T.
  *
  * Params:
  *     count = number of instances to allocate.
  * Returns: pointer (of type T*) to memory buffer
  */
public T* allocate(T)(int count = 1)
{
	T* mem = cast(T*) tango.stdc.stdlib.malloc(T.sizeof*count);
	return mem;
}

/**
  * Allocates (using C's malloc) a one dimensional array.
  * Params:
  *     count = array size
  * Returns: new array
  */
public T[] allocate(T : T[])(int count)
{
	T* mem = cast(T*) tango.stdc.stdlib.malloc(count*T.sizeof);		
	return mem[0..count];
}

/**
  * Allocates (using C's malloc) a two dimensional array.
  *
  * Params:
  *     rows = rows in the new array
  *     cols = cols in the new array, if cols==-1 then this function
  *     		allocates a one dimensional array of empty arrays
  * Returns: the new two dimensional array
  */
public T[][] allocate(T : T[][])(int rows, int cols = -1)
{
	if (cols == -1) {
		T[]* mem = cast(T[]*) tango.stdc.stdlib.malloc(rows*(T[]).sizeof);
		return mem[0..rows];
	}
	else {
		//if (rows & 1) rows++; // for 16 byte allignment	
		void* mem = 
tango.stdc.stdlib.malloc(rows*(T[]).sizeof+rows*cols*T.sizeof);
		T[]* index = cast(T[]*) mem;
		T* mat = cast(T*) (mem+rows*(T[]).sizeof);
		
		for (int i=0;i<rows;++i) {
			index[i] = mat[0..cols];
			mat += cols;
		}
		
		return index[0..rows];
	}
}

/**
  * Template to check is a type is an array.
  */
private template isArray(T)
{
     static if( is( T U : U[] ) )
         const isArray = true;
     else
         const isArray = false;
}

/**
  * Frees allocated memory.
  * Params:
  *     ptr = variable to free.
  */
public void free(T)(T ptr)
{
	static if ( isArray!(T) ) {
		tango.stdc.stdlib.free(cast(void*) ptr.ptr);
	}
	else {
		tango.stdc.stdlib.free(cast(void*) ptr);
	}
}

Mar 05 2008

bearophile <bearophileHUGS lycos.com> writes:

Marius Muja:
 I'm using something very similar in my speed-critical code:

Thank you for showing. Your code is more refined (mine was an experimental
version, not refined yet, little tested, dirty, etc) and it shows a couple
ideas I too may adopt/copy.

But beside such secondary improvements, my main question was: if the element of
those arrays  are object references, or pointers to GC structs, aren't such
arrays going to produce GC problems when they aren't initialized yet? They can
contain pointers that the GC may add to the roots... even if that memory block
is essentially empty still. 

So I presume I must always disable the GC looking for pointers on them, and
make the GC look for pointers only after I have filled the whole array (and it
contains reference types, but I presume the GC is smart enough to stop looking
at it if a block is filled with just non-reference types).

And note that such data structure is simple, it's just an 1D/2D array. When
data structures are more complex I don't know anymore how the GC will manage
them in such mixed manual/automatic situations... Programming in C is much
simpler, because there's no a hidden subystem (the GC) that works in mysterious
ways when your code is running.

Bye,
bearophile

Mar 05 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

bearophile wrote:
 Marius Muja:
 I'm using something very similar in my speed-critical code:

 
 Thank you for showing. Your code is more refined (mine was an experimental
version, not refined yet, little tested, dirty, etc) and it shows a couple
ideas I too may adopt/copy.
 
 But beside such secondary improvements, my main question was: if the element
of those arrays  are object references, or pointers to GC structs, aren't such
arrays going to produce GC problems when they aren't initialized yet? They can
contain pointers that the GC may add to the roots... even if that memory block
is essentially empty still. 
 
 So I presume I must always disable the GC looking for pointers on them, and
make the GC look for pointers only after I have filled the whole array (and it
contains reference types, but I presume the GC is smart enough to stop looking
at it if a block is filled with just non-reference types).

 And note that such data structure is simple, it's just an 1D/2D
 array. When data structures are more complex I don't know anymore how
 the GC will manage them in such mixed manual/automatic situations...
 Programming in C is much simpler, because there's no a hidden
 subystem (the GC) that works in mysterious ways when your code is
 running.

Sorry I didn't read your post in detail, but if think C-like behavior is 
easier, then why don't you just use 
std.c.stdlib.{malloc,calloc,alloca,realloc,free} ?

--bb

Mar 05 2008

Marius Muja <mariusm cs.ubc.ca> writes:

bearophile wrote:
 Marius Muja:
 I'm using something very similar in my speed-critical code:

 
 Thank you for showing. Your code is more refined (mine was an experimental
version, not refined yet, little tested, dirty, etc) and it shows a couple
ideas I too may adopt/copy.
 
 But beside such secondary improvements, my main question was: if the element
of those arrays  are object references, or pointers to GC structs, aren't such
arrays going to produce GC problems when they aren't initialized yet? They can
contain pointers that the GC may add to the roots... even if that memory block
is essentially empty still. 
 
 So I presume I must always disable the GC looking for pointers on them, and
make the GC look for pointers only after I have filled the whole array (and it
contains reference types, but I presume the GC is smart enough to stop looking
at it if a block is filled with just non-reference types).

I usually disable the GC in all speed-critical code and I manage memory 
manually.

 
 And note that such data structure is simple, it's just an 1D/2D array. When
data structures are more complex I don't know anymore how the GC will manage
them in such mixed manual/automatic situations... Programming in C is much
simpler, because there's no a hidden subystem (the GC) that works in mysterious
ways when your code is running.
 
 Bye,
 bearophile

Mar 06 2008

bearophile <bearophileHUGS lycos.com> writes:

Bill Baxter:
 Sorry I didn't read your post in detail, but if think C-like behavior is 
 easier, then why don't you just use 
 std.c.stdlib.{malloc,calloc,alloca,realloc,free} ?

You may want to read my post :-)
If you create a data structure you may want to mix GC-managed things with data
structures allocated manually. This leads to things I don't understand yet
(until someone explains me, or I just read the whole code of the Phobos GC).

Bye,
bearophile

Mar 05 2008

Bill Baxter <dnewsgroup billbaxter.com> writes:

bearophile wrote:
 Bill Baxter:
 Sorry I didn't read your post in detail, but if think C-like behavior is 
 easier, then why don't you just use 
 std.c.stdlib.{malloc,calloc,alloca,realloc,free} ?

 
 You may want to read my post :-)
 If you create a data structure you may want to mix GC-managed things with data
structures allocated manually. This leads to things I don't understand yet
(until someone explains me, or I just read the whole code of the Phobos GC).

Ok, I just latched on to your phrase "Programming in C is much simpler". 
    If C doesn't even support the kind of thing you want to do, then you 
can't really say it's simpler.  Simplistic maybe.  After all, anything 
you can do in C you can pretty much do with very similar code in D.

--bb

Mar 05 2008

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Code to show