D - Dynamic rectangular arrays?

Norbert Nemec (39/39) Apr 24 2004 Hi there,

Walter (9/46) Apr 24 2004 don't

Norbert Nemec (2/4) Apr 24 2004 I'm trying just this right now. Let's see what come from it.

J Anderson (4/12) Apr 24 2004 Could be something that goes into DTL...

Matthew (5/18) Apr 24 2004 I think that should be pretty straightforward. I have two efficient (spa...

Drew McCormack (3/8) Apr 28 2004 Did you try this Norbert? Any luck?

Norbert Nemec (4/13) Apr 28 2004 I have something working, but it needs some more cleanup before I can pu...

Drew McCormack (56/63) Apr 27 2004 As someone coming from scientific programming, where Fortran is still

Norbert Nemec (7/17) Apr 27 2004 I clearly vote against this: it adds to the runtime-overhead. Even if it

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Hi there,

As far as I can see, there is no way to have rectangular arrays if you don't
know their size at compile time? This definitely is a limitation! Of
course, there is some overhead, if you have to store the shape of an array
and calculate column offsets at run-time, but it should still be much more
efficient than to use arrays of pointers like in C.

My proposal, how this could be achived would be an extension of array
pointers to something like:

struct array3d(T) {
        T *ptr;
        size_t range0;
        size_t stride0;
        size_t range1;
        size_t stride1;
        size_t range2;
        size_t stride2;
};

Here you would have as many rang/stride pair as the array has dimensions. If
the compiler knows stride to be 1, it can optimize it away, reducing
everything to the current 1d arrays.

The range arguments are only needed for range checking. Element access only
needs access to the ptr and the stride arguments.

Additional benefits:
* slicing possible not only to blocks but also to arbitrary sub-grids.
* simple implementation of fortran-arrays and even reverse memory layouts

Of course, all of this could be done in the library, but then - since we
already have arrays as language elements, it would make sense to make
truely multidimensional arrays part of the language as well.

As for the syntax: I would propose to use A[a,b] as way to express
rectangular multidimensional arrays and leave A[a][b] for ragged arrays as
we have them now. Of course, the compiler could still optimize ragged
arrays with fixed size, but that would be another matter.

Array pointers, of course have to have a dimension known at compile time. If
their range is dynamic, the type would, of course, be something like
        int[,]

(This will, of course, look ugly for 17-dimensional arrays, but who needs
these?)

Ciao,
Nobbi

Apr 24 2004

"Walter" <walter digitalmars.com> writes:

"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c6d8br$1un2$1 digitaldaemon.com...
 As far as I can see, there is no way to have rectangular arrays if you

don't
 know their size at compile time?

That's correct.

 This definitely is a limitation! Of
 course, there is some overhead, if you have to store the shape of an array
 and calculate column offsets at run-time, but it should still be much more
 efficient than to use arrays of pointers like in C.

 My proposal, how this could be achived would be an extension of array
 pointers to something like:

 struct array3d(T) {
         T *ptr;
         size_t range0;
         size_t stride0;
         size_t range1;
         size_t stride1;
         size_t range2;
         size_t stride2;
 };

 Here you would have as many rang/stride pair as the array has dimensions.

If
 the compiler knows stride to be 1, it can optimize it away, reducing
 everything to the current 1d arrays.

 The range arguments are only needed for range checking. Element access

only
 needs access to the ptr and the stride arguments.

 Additional benefits:
 * slicing possible not only to blocks but also to arbitrary sub-grids.
 * simple implementation of fortran-arrays and even reverse memory layouts

 Of course, all of this could be done in the library, but then - since we
 already have arrays as language elements, it would make sense to make
 truely multidimensional arrays part of the language as well.

 As for the syntax: I would propose to use A[a,b] as way to express
 rectangular multidimensional arrays and leave A[a][b] for ragged arrays as
 we have them now. Of course, the compiler could still optimize ragged
 arrays with fixed size, but that would be another matter.

 Array pointers, of course have to have a dimension known at compile time.

If
 their range is dynamic, the type would, of course, be something like
         int[,]

 (This will, of course, look ugly for 17-dimensional arrays, but who needs
 these?)

Your idea is good, and will work, but at the moment you can achieve nearly
the same thing by declaring a class with [] operator overloads.

Apr 24 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve nearly
 the same thing by declaring a class with [] operator overloads.

I'm trying just this right now. Let's see what come from it.

Apr 24 2004

J Anderson <REMOVEanderson badmama.com.au> writes:

Norbert Nemec wrote:

Walter wrote:
  

Your idea is good, and will work, but at the moment you can achieve nearly
the same thing by declaring a class with [] operator overloads.
    

I'm trying just this right now. Let's see what come from it.
  

Could be something that goes into DTL...

-- 
-Anderson: http://badmama.com.au/~anderson/

Apr 24 2004

"Matthew" <matthew.hat stlsoft.dot.org> writes:

"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:c6ddcb$2blh$1 digitaldaemon.com...
 Norbert Nemec wrote:

Walter wrote:


Your idea is good, and will work, but at the moment you can achieve nearly
the same thing by declaring a class with [] operator overloads.

I'm trying just this right now. Let's see what come from it.

 Could be something that goes into DTL...

I think that should be pretty straightforward. I have two efficient
(space/speed)
and expressive rectangular array templates in STLSoft (http://stlsoft.org/) and
I
see no reason why I can't borrow them into DTL.

Apr 24 2004

Drew McCormack <drewmccormack mac.com> writes:

On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:

 Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve nearly
 the same thing by declaring a class with [] operator overloads.

 
 I'm trying just this right now. Let's see what come from it.

Did you try this Norbert? Any luck?

Drew McCormack

Apr 28 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Drew McCormack wrote:

 On 2004-04-24 11:56:50 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:
 
 Walter wrote:
 Your idea is good, and will work, but at the moment you can achieve
 nearly the same thing by declaring a class with [] operator overloads.

 
 I'm trying just this right now. Let's see what come from it.

 
 Did you try this Norbert? Any luck?

I have something working, but it needs some more cleanup before I can put it
up for download with a clear conscience. Maybe I'll find the time for that
tonight, otherwise, it will have to wait for next week.

Apr 28 2004

Drew McCormack <drewmccormack mac.com> writes:

On 2004-04-24 10:31:48 +0200, Norbert Nemec <Norbert.Nemec gmx.de> said:

 Hi there,
 
 As far as I can see, there is no way to have rectangular arrays if you don't
 know their size at compile time? This definitely is a limitation! Of
 course, there is some overhead, if you have to store the shape of an array
 and calculate column offsets at run-time, but it should still be much more
 efficient than to use arrays of pointers like in C.

As someone coming from scientific programming, where Fortran is still 
king, I would like to chime in here. D is offering me hope that I may 
one day be able to leave Fortran behind, so I don't want you guys to 
mess it up at the last hurdle.

Fortran is a pretty rotten language, but if it gets one thing right, it 
is arrays. Fortran 90 that is. If D basically implements the Fortran 90 
approach, leaving the extremely primitive C approach largely behind, I 
think it could capture the imagination of a lot of scientific 
developers. And be a much more powerful language to boot.

Since I assume most people here are from the C/C++ side of programming, 
here is some stuff you can do with fortran 90 arrays (in D-type syntax)

1) Create them on the stack, with a size not known at compile time 
(called automatic arrays)

void someFunc( int n ) {
	float [n] array;
}

2) Create them on the heap with a size not known at compile time
	int n, m;
  	n = 5;
	m = 10;
	float [] array = new float[n][m];

3) Query the size of any dimension
	float [10][5] array;
	array.size(1); // this is equal to 10
	array.size(2); // this is equla to 5

4) Have lower and upper bounds
	float [2..10][2..4] array;
	array.lowerbound(1); // this is 2
	array.upperbound(2); // this is 4

5) Slice arbitrary dimensions
	float [10][5] array;
	array[2..8][1..4];  // Gives a new rank 2 array

6) Include strides in array slices. These can also be negative.
	float[10][5] array;
	array[2..4:2][1..4:3]; // This is rank 2 array with indexes { { [2,1], 
[2,4] } { [4,1],[4,4] } }
	array[4..2:-2][1..4:3]; // This is rank 2 array with { { [4,1], [4,4] 
} { [2,1],[2,4] } }

7) Some powerful functions, like reductions, matrix multiplication, and 
'broadcasting'

	float [10][5] matrix, resultmat;
	float [5] vector, resultvec;
	resultvec[] = matrixmult( matrix, vector );
	resultmat[][] = broadcast( vector, 1, 10 ); // broadcast expands 
vector into matrix by duplicating

Another way to do broadcasting is used in the python library Numpy: 
NewAxis. It is particularly elegant. It creates temporary arrays which 
are broadcasts of other arrays:

	matrixmultiply(  vector1[][newaxis], vector2[]  );
		// Turns vector into a temporay matrix, and multiplies by vector2

What I don't want, is to have to rely of my own array class and 
templates. That is just the C++ situation, and leads to all sorts of 
problems. Better to make your arrays powerful, and just put some more 
advanced functions (like matrixmultiply, for example) in the library.

Drew McCormack

Apr 27 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Drew McCormack wrote:
 4) Have lower and upper bounds
 float [2..10][2..4] array;
 array.lowerbound(1); // this is 2
 array.upperbound(2); // this is 4

I clearly vote against this: it adds to the runtime-overhead. Even if it
might be convenient in some cases, it never is necessary and may even make
the code harder to read.

 7) Some powerful functions, like reductions, matrix multiplication, and
 'broadcasting'

Here, the question is how much has to be in the language, and how much
should be put into the library instead.

 What I don't want, is to have to rely of my own array class and
 templates. That is just the C++ situation, and leads to all sorts of
 problems. Better to make your arrays powerful, and just put some more
 advanced functions (like matrixmultiply, for example) in the library.

Just my position. Arrays are just too basic to leave them to the library.

Apr 27 2004

D Programming

C/C++ Programming

Other

D - Dynamic rectangular arrays?