digitalmars.D.learn - Fast temporary dynamic arrays? (And slicing of them)

Tom Kazimiers (108/108) Sep 05 2010 Hi all,

bearophile (13/21) Sep 05 2010 The appender.data() doesn't currently copy data.

bearophile (3/3) Sep 05 2010 My first test shows that it may work. But I have to grow the array backw...

Tom Kazimiers (8/13) Sep 07 2010 Hi,

bearophile (29/31) Sep 07 2010 Just created:

bearophile (3/4) Sep 07 2010 And you may add another runtime test to see in what direction the stack ...

Steven Schveighoffer (17/30) Sep 07 2010 Hm... you can do something like this (after upcoming release, appender h...

Tom Kazimiers (11/47) Sep 07 2010 Thanks for your clarification and examples, that made the whole array
bearophile (7/10) Sep 07 2010 Thank you for your answers. But I don't fully understand your answer. Do...

Steven Schveighoffer (9/22) Sep 07 2010 Yes.

bearophile (5/7) Sep 07 2010 I see, that's useful. I will write a Pimp-less version of it, then (when...

Tom Kazimiers <2voodoo gmx.de> writes:

Hi all,

so I have started to look at D and dug through the documentation, but
could not get a good answer on the following:

How can I have a (temporary) dynamic array on stack and make references
to it (no copying)? I successively put integers in an array (but don't
know how much there will be in advance) with an appender!(int[]) and get
the date out with appender.data(). Later on I pass the result to a
method as an "in int[]" parameter. Is that already a reference or will
it be copied? Are there better methods to accomplish this? The method
receiving such an array will not modifiy contents of the array, but only
read from it.

Thanks in advance,
Tom

--

P.s. To put a bit of context around that, here some notes on what I am
working (some questions here as well, but above is the primary one):

My first try-out project in D is an Object file reader (often used in
computer graphics). Part of that is a parsing method which parses a
single line that contains a definition of a face (a polygon). It could
for instance look like this:

f 1//10 2//10 3//10

That means it is a triangle face of points 1, 2 and 3 (three groups,
first number is point index). Furthermore no texture coordinate index
(no number between two slashes) and each with normal vector index 10.
But I don't want to go into detail of that. Say I want to parse that
line with D and in the end call a method to process that face with
references to lists of the different point indices:

void process_face(int index, int vcount, in int[] vertices,
                  in int[] texcoords = null, in int[] normals = null)
{
    ...
}

(I guess "in" means sth. like const reference)

The arrays should not be copied, but be references. The line parsing
method now has the following lines (line is a char[]):

//search face
int index = indexOf(line, "f ");
if(index != -1)
{
    line = line[index+2 .. $];  // slice away the "f " part
    fc++; // increment face count

    int slash;
    while(true)
    {
         slash = indexOf(line, " /");  // leading spaces + slashes?
         if(slash != -1)
             // remove space
             line = line[0 .. slash] ~ line[slash+1 .. $];
         else
             break;
    }
    while(true)
    {
        slash = indexOf(line, "/ ");  // trailing spaces + slashes?
            if(slash != -1)
                // remove space	
                line = line[0 .. slash+1] ~ line[slash+2 .. $];
            else
                break;
    }

    // dynamic vertex, texture and normal arrays
    auto vIndices = appender!(int[])();
    auto tIndices = appender!(int[])();
    auto nIndices = appender!(int[])();

    // some indices
    int vi,ti,ni;

    // split line on white spaces
    char[][] vertexCoords = split( line );
    // go through each part - are those splittings ok?
    foreach(char[] coord; vertexCoords) {
        vi = parse!(int)(coord); //get int from string
        vIndices.put( vi ); // save it in vertex array
        if (coord[0] == '/') { // follows a slash?
            coord = coord[1 ..$]; // get rid of it
            if (coord[0] == '/') { // follows another slash?
                coord = coord[1 ..$]; // get rid of it
                ni = parse!(int)( coord ); // git following int
                nIndices.put( ni ); // save it in normal array
            } else {
                ti = parse!(int)( coord );
                tIndices.put( ti );
                if (coord[0] == '/') {
                    coord = coord[1 ..$];
                    int ni = parse!(int)( coord );
                    nIndices.put( ni );
                }
            }
        }
    }

    // array references for passing to processing method
    int[] varray = null, tarray = null, narray = null;

    // if we have data, save it to appropriate varible
    if( !(vIndices.data().empty()) )
	    varray = vIndices.data();

    if( !(tIndices.data().empty()) )
	    tarray = tIndices.data();


    if( !(nIndices.data().empty()) )
	    narray = nIndices.data();

    // process it
    process_face(fc, vIndices.data().length, varray, tarray, narray);

    return;
}

I hope this rather lengthy explanation is no problem here (if looked on
it at all, since it was not my primary question :-) ). If you are in the
mood, please comment on how make parts on it better. It is pretty much
my first D code. Well, thanks.

Cheers
Tom

Sep 05 2010

bearophile <bearophileHUGS lycos.com> writes:

Tom Kazimiers:
 How can I have a (temporary) dynamic array on stack and make references
 to it (no copying)? I successively put integers in an array (but don't
 know how much there will be in advance) with an appender!(int[]) and get
 the date out with appender.data(). Later on I pass the result to a
 method as an "in int[]" parameter. Is that already a reference or will
 it be copied? Are there better methods to accomplish this? The method
 receiving such an array will not modifiy contents of the array, but only
 read from it.

The appender.data() doesn't currently copy data.

There is no standard way to put a growable array on the stack. Maybe you can
hack it with several successive calls to alloca(), but I have never tried it.

There many other solutions, like:
- Using a fixed-sized array on the stack, you keep its true length in a variable
- The same, with static array
- The same with a static gshared array
- pre-allocating a "large enough" dynamic array before all the calls to the
function that uses them
- using a single alloca() when you know how many items you have to append
- use a deque data structure that uses a manually-managed pool of blocks of
items, that you can concatenate in a linked list or index through a dynamic
array of pointers

The cuter solution is to simulate a realloc on the stack (to keep a single
growable array) with a series of calls to alloca. But I don't know if it works
:-) I will try it.

Bye,
bearophile

Sep 05 2010

bearophile <bearophileHUGS lycos.com> writes:

My first test shows that it may work. But I have to grow the array backwards,
and push back the array start, because that's how my stack grows (using alloca
to allocate geometrically bigger chunks). So unless you want to reverse the
items once the array is built, you have to change the algorithm that uses the
array a little.

Bye,
bearophile

Sep 05 2010

Tom Kazimiers <2voodoo gmx.de> writes:

Hi,

thanks for your tests.

On 09/06/2010 04:59 AM, bearophile wrote:
 My first test shows that it may work. But I have to grow the array
 backwards, and push back the array start, because that's how my stack
 grows (using alloca to allocate geometrically bigger chunks). So
 unless you want to reverse the items once the array is built, you
 have to change the algorithm that uses the array a little.

Unfortunately I need the elements to stay in the order I have added
them. But good to know that it would work with backward growing of the
array - do have you an example of that?

Cheers,
Tom

Sep 07 2010

bearophile <bearophileHUGS lycos.com> writes:

Tom Kazimiers:
 But good to know that it would work with backward growing of the
 array - do have you an example of that?

Just created:


import std.stdio: writeln;
import std.c.stdlib: alloca;

void main() {
    int n = 30;
    alias int T;
    enum int initialCapacity = 4;
    static assert(initialCapacity > 0);

    int len = 0;
    int capacity = initialCapacity;
    int* ptr = cast(int*)alloca(capacity * T.sizeof);
    ptr += initialCapacity - 1; // correct?

    foreach_reverse (i; 0 .. n) {
        if (i >= capacity) {
            alloca(capacity * T.sizeof);
            capacity *= 2;
        }
        ptr--;
        *ptr = i;
        len++;
    }

    writeln("len, capacity: ", len, " ", capacity);
    auto arr = ptr[0 .. len];
    writeln(arr);
}


Beware of stack overflows.

Bye,
bearophile

Sep 07 2010

bearophile <bearophileHUGS lycos.com> writes:

 Beware of stack overflows.

And you may add another runtime test to see in what direction the stack grows.

Bye,
bearophile

Sep 07 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 05 Sep 2010 22:41:50 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Tom Kazimiers:
 How can I have a (temporary) dynamic array on stack and make references
 to it (no copying)? I successively put integers in an array (but don't
 know how much there will be in advance) with an appender!(int[]) and get
 the date out with appender.data(). Later on I pass the result to a
 method as an "in int[]" parameter. Is that already a reference or will
 it be copied? Are there better methods to accomplish this? The method
 receiving such an array will not modifiy contents of the array, but only
 read from it.

 The appender.data() doesn't currently copy data.

 There is no standard way to put a growable array on the stack. Maybe you  
 can hack it with several successive calls to alloca(), but I have never  
 tried it.

Hm... you can do something like this (after upcoming release, appender has  
changed):

void foo()
{
    int[1024] buf;
    auto app = appender(buf[]);
    app.clear();
    ...
}

After app.clear, appender will fill up the static buffer until full, and  
then reallocate on the heap.

Note that the new appender uses heap data to store its implementation, so  
it's not as quick as it could be.  This is per Andrei's requirement that  
it be a reference type.

-Steve

Sep 07 2010

Tom Kazimiers <2voodoo gmx.de> writes:

Hi,

On 09/06/2010 04:55 AM, Jonathan M Davis wrote:
 Static arrays are value types, but dynamic arrays are reference types.

 [...]

 No array copying takes place anywhere in that program. If you want to
 copy an array, you'd do one of the following

 [...]

 Passing dynamic arrays to functions passes the reference.

Thanks for your clarification and examples, that made the whole array
handling clearer to me.

On 09/07/2010 02:56 PM, Steven Schveighoffer wrote:
 On Sun, 05 Sep 2010 22:41:50 -0400, bearophile wrote:
 Tom Kazimiers:
 How can I have a (temporary) dynamic array on stack and make
 references to it (no copying)? I successively put integers in an
 array (but don't know how much there will be in advance) with an
 appender!(int[]) and get the date out with appender.data(). Later
 on I pass the result to a method as an "in int[]" parameter. Is
 that already a reference or will it be copied? Are there better
 methods to accomplish this? The method receiving such an array will
 not modifiy contents of the array, but only read from it.

 The appender.data() doesn't currently copy data.

 There is no standard way to put a growable array on the stack. Maybe
 you can hack it with several successive calls to alloca(), but I have
 never tried it.


Ok, good to know. But for now I will stay away from optimizations with
alloca, etc. - its's harder to read the code then - premature
optimization is evil :-)

 Hm... you can do something like this (after upcoming release, appender
 has changed):

 void foo()
 {
    int[1024] buf;
    auto app = appender(buf[]);
    app.clear();
    ...
 }

 After app.clear, appender will fill up the static buffer until full,
 and then reallocate on the heap.

 Note that the new appender uses heap data to store its implementation,
 so it's not as quick as it could be.  This is per Andrei's requirement
 that it be a reference type.

That sound good - when will this upcoming release be?

Cheers,
Tom

Sep 07 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:
 Note that the new appender uses heap data to store its implementation, so  
 it's not as quick as it could be.  This is per Andrei's requirement that  
 it be a reference type.

Thank you for your answers. But I don't fully understand your answer. Do you
mean it uses the Pimpl idiom, and allocates the struct on the heap?

I use appender only when performance is important. The appender is a hack
useful because array appending in D is very slow (and even appender is quite
slow), so it must be first of all fast, otherwise it's not useful.

I generally use appender inside the scope of a single function.

So unless I am missing something I think Andrei requirement is/was wrong.

Bye,
bearophile

Sep 07 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 07 Sep 2010 12:54:52 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 Note that the new appender uses heap data to store its implementation,  
 so
 it's not as quick as it could be.  This is per Andrei's requirement that
 it be a reference type.

 Thank you for your answers. But I don't fully understand your answer. Do  
 you mean it uses the Pimpl idiom, and allocates the struct on the heap?

Yes.

 I use appender only when performance is important. The appender is a  
 hack useful because array appending in D is very slow (and even appender  
 is quite slow), so it must be first of all fast, otherwise it's not  
 useful.

Appending is as fast as possible, at the cost of an initial allocation.   
Without this, the capacity would have to be stored inside the array, or be  
aliased, which wouldn't work all that well.

 I generally use appender inside the scope of a single function.

 So unless I am missing something I think Andrei requirement is/was wrong.

An appender is an ouput range, so passing it into a function so the  
function can output to it is a requirement.

-Steve

Sep 07 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:
 An appender is an ouput range, so passing it into a function so the  
 function can output to it is a requirement.

I see, that's useful. I will write a Pimp-less version of it, then (when I
don't need a range, but just a local accumulator).

Thank you for your always gentle answers,
hugs,
bearophile

Sep 07 2010

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Fast temporary dynamic arrays? (And slicing of them)