www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Fixed size multidimensional array at runtime

reply "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
I know multidimensional arrays has been brought up many times, 
although I was not able to find a clear answer to my question. My 
knowledge of what's going on behind the curtains is somewhat 
lacking, please correct me if my assumptions are incorrect.

Creating a dynamic multidimensional array can be easily achieved 
with for example "auto matrix = new int[][](4, 2);", although if 
I've understood it correct this would create a "jagged" array (as 
explained on page 112 in TDPL) which may cause efficiency issues 
due to two indirections as opposed to only one indirection which 
you would have in a "rectangular" array (as explained at 
http://dlang.org/arrays.html). If you at compile time know the 
dimensions of the array you could write "int[2][4] matrix;", and 
I've understood this as creating a "rectangular" array.

In my case I don't know the dimensions at compile time, but I'm 
still interested in creating a multidimensional array with only 
one indirection (i.e. allocated contiguously in memory) at 
runtime, where I'm not going to modify the size of the array. Is 
this impossible* in D?
*I know I could create a one-dimensional array and 
programmatically convert from multiple dimensions to one 
dimension, yet this is not as satisfactory as a "true" 
multidimensional array.

Obviously it's the efficiency I worry about, I would much 
appreciate if someone could shed light upon this.
Jun 30 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, June 30, 2012 20:21:57 Vidar Wahlberg wrote:
 I know multidimensional arrays has been brought up many times,
 although I was not able to find a clear answer to my question. My
 knowledge of what's going on behind the curtains is somewhat
 lacking, please correct me if my assumptions are incorrect.
 
 Creating a dynamic multidimensional array can be easily achieved
 with for example "auto matrix = new int[][](4, 2);", although if
 I've understood it correct this would create a "jagged" array (as
 explained on page 112 in TDPL) which may cause efficiency issues
 due to two indirections as opposed to only one indirection which
 you would have in a "rectangular" array (as explained at
 http://dlang.org/arrays.html). If you at compile time know the
 dimensions of the array you could write "int[2][4] matrix;", and
 I've understood this as creating a "rectangular" array.
 
 In my case I don't know the dimensions at compile time, but I'm
 still interested in creating a multidimensional array with only
 one indirection (i.e. allocated contiguously in memory) at
 runtime, where I'm not going to modify the size of the array. Is
 this impossible* in D?
 *I know I could create a one-dimensional array and
 programmatically convert from multiple dimensions to one
 dimension, yet this is not as satisfactory as a "true"
 multidimensional array.
 
 Obviously it's the efficiency I worry about, I would much
 appreciate if someone could shed light upon this.

In D, static arrays are always fixed in size, and that size must be known at compile time, whereas dynamic arrays are never fixed in size (unless they're immutable), and the size doesn't need to be known at compile time. There is no way to have a static array whose size isn't known at compile time, which is what you'd need for what you're trying to do. I believe that the only way to do it would be to create a struct which wrapped a single dimensional, dynamic array, and then overloaded opIndex appropriately so that you could index it as it were multi-dimensional, which aside from the fact that the array _could_ be resized (though presumably, you wouldn't make that possible through your struct's API), that's exactly what a rectangular array implementation would be doing anyway. - Jonathan M Davis
Jun 30 2012
prev sibling next sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Saturday, 30 June 2012 at 18:32:09 UTC, Jonathan M Davis wrote:
 In D, static arrays are always fixed in size, and that size 
 must be known at
 compile time, whereas dynamic arrays are never fixed in size 
 (unless they're
 immutable), and the size doesn't need to be known at compile 
 time. There is no
 way to have a static array whose size isn't known at compile 
 time, which is
 what you'd need for what you're trying to do.

Thanks for the answer. It makes sense, yet when used to multidimensional arrays as implemented in Java it takes some time to wrap your head around this.
 I believe that the only way to do it would be to create a 
 struct which wrapped
 a single dimensional, dynamic array, and then overloaded 
 opIndex appropriately

This is a very good suggestion, I hadn't thought of this possibility, this way I can get my beloved "matrix[x][y];" instead of something like "matrix.get(x, y);".
Jun 30 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, June 30, 2012 21:01:02 Vidar Wahlberg wrote:
 On Saturday, 30 June 2012 at 18:32:09 UTC, Jonathan M Davis wrote:
 In D, static arrays are always fixed in size, and that size
 must be known at
 compile time, whereas dynamic arrays are never fixed in size
 (unless they're
 immutable), and the size doesn't need to be known at compile
 time. There is no
 way to have a static array whose size isn't known at compile
 time, which is
 what you'd need for what you're trying to do.

Thanks for the answer. It makes sense, yet when used to multidimensional arrays as implemented in Java it takes some time to wrap your head around this.
 I believe that the only way to do it would be to create a
 struct which wrapped
 a single dimensional, dynamic array, and then overloaded
 opIndex appropriately

This is a very good suggestion, I hadn't thought of this possibility, this way I can get my beloved "matrix[x][y];" instead of something like "matrix.get(x, y);".

It might have to be matrix[x, y] though, since while you can make opIndex take as many arguments as you want, I don't believe that it will let you split them out into multiple indexing operations like in matrix[x][y]. If you really want that, you'd have to create a helper type which opIndex returned and have that helper type overload opIndex for the second index. But if efficiency is your main concern, that might be a problem, since I expect that that adds some overhead (though probably not a lot). - Jonathan M Davis
Jun 30 2012
prev sibling next sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Saturday, 30 June 2012 at 19:06:31 UTC, Jonathan M Davis wrote:
 On Saturday, June 30, 2012 21:01:02 Vidar Wahlberg wrote:
 This is a very good suggestion, I hadn't thought of this
 possibility, this way I can get my beloved "matrix[x][y];"
 instead of something like "matrix.get(x, y);".

It might have to be matrix[x, y] though, since while you can make opIndex take as many arguments as you want, I don't believe that it will let you split them out into multiple indexing operations like in matrix[x][y].

Doh, you are of course correct. That's slightly unfortunate, then I'm actually leaning a bit more towards creating a "get(x, y)" method as a "[x, y]" construct would be a bit unusual (at least for me, for the time being).
Jun 30 2012
prev sibling next sibling parent reply Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
30.06.2012 22:21, Vidar Wahlberg пишет:
 I know multidimensional arrays has been brought up many times, although
 I was not able to find a clear answer to my question. My knowledge of
 what's going on behind the curtains is somewhat lacking, please correct
 me if my assumptions are incorrect.

 Creating a dynamic multidimensional array can be easily achieved with
 for example "auto matrix = new int[][](4, 2);", although if I've
 understood it correct this would create a "jagged" array (as explained
 on page 112 in TDPL) which may cause efficiency issues due to two
 indirections as opposed to only one indirection which you would have in
 a "rectangular" array (as explained at http://dlang.org/arrays.html). If
 you at compile time know the dimensions of the array you could write
 "int[2][4] matrix;", and I've understood this as creating a
 "rectangular" array.

 In my case I don't know the dimensions at compile time, but I'm still
 interested in creating a multidimensional array with only one
 indirection (i.e. allocated contiguously in memory) at runtime, where
 I'm not going to modify the size of the array. Is this impossible* in D?
 *I know I could create a one-dimensional array and programmatically
 convert from multiple dimensions to one dimension, yet this is not as
 satisfactory as a "true" multidimensional array.

 Obviously it's the efficiency I worry about, I would much appreciate if
 someone could shed light upon this.

You could be interested in my answer on this thread: http://forum.dlang.org/thread/mailman.1578.1339962782.24740.digitalmars-d-learn puremagic.com But looks like nobody really need such implementation (nobody asked me to make it up-to-date or put under VCS). -- Денис В. Шеломовский Denis V. Shelomovskij
Jun 30 2012
next sibling parent reply Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
01.07.2012 0:06, Vidar Wahlberg пишет:
 On Saturday, 30 June 2012 at 19:35:33 UTC, Denis Shelomovskij wrote:
 Thanks for the tip, that is interesting (I'm surprised I didn't come
 across this post when searching for this issue earlier). Although it
 seems to me that you still end up with "matrix[x, y, z]" instead of
 "matrix[x][y][z]", so it will only solve one half of the problem :)

I'm curious why do you need such syntax? `matrix[x][y][z]` expression means that `matrix[x]` is also a valid expression but it shouldn't. Slicing can be done using something like my implementation: `matrix[x, R[], R[]][y, R[]][z]` where `matrix[x, R[], R[]]` is obviously a valid slice. So I deliberately disallow rule "`matrix[x]` means `matrix[x, R[]...]`" and made `byTopDimension` property for such iteration: `matrix.byTopDimension[x].byTopDimension[y].byTopDimension[z]` See: http://deoma-cmd.ru/d/docs/src/my/rarray.html#byTopDimension -- Денис В. Шеломовский Denis V. Shelomovskij
Jul 01 2012
next sibling parent Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
01.07.2012 13:46, Denis Shelomovskij пишет:
 So I deliberately disallow rule
   "`matrix[x]` means `matrix[x, R[]...]`"
 and made `byTopDimension` property for such iteration:
 `matrix.byTopDimension[x].byTopDimension[y].byTopDimension[z]`

 See:
 http://deoma-cmd.ru/d/docs/src/my/rarray.html#byTopDimension

And yes, addition of `alias byTopDimension opIndex;` in my `RectangularArray` struct should allow your (inconsistent and error-prone, IMHO) syntax. -- Денис В. Шеломовский Denis V. Shelomovskij
Jul 01 2012
prev sibling parent Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
01.07.2012 16:02, Vidar Wahlberg пишет:
 On Sunday, 1 July 2012 at 09:46:52 UTC, Denis Shelomovskij wrote:
 I'm curious why do you need such syntax? `matrix[x][y][z]` expression
 means that `matrix[x]` is also a valid expression but it shouldn't.

It's not a syntax I need, it's a syntax I desire as that's the syntax I'm used to for multidimensional arrays.

No, that is the syntax you used for arrays of arrays. -- Денис В. Шеломовский Denis V. Shelomovskij
Jul 01 2012
prev sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 06/30/12 22:06, Vidar Wahlberg wrote:
 Although it seems to me that you still end up with "matrix[x, y, z]" instead
of "matrix[x][y][z]", so it will only solve one half of the problem :)
 For this particular case I'll just do the conversion from two-dimensional to
one-dimensional array programmatically and use a "get(x, y)"-method,

"matrix[x,y,z]" is a problem, yet "matrix.get(x,y,z)" is fine? Anyway, converting one syntax to the other is trivial - see example below. You also get the "matrix[x,y][z]" and "matrix[x][y,z]" syntax as a bonus. All of the neatIdx glue gets optimized away, so in this case there's zero extra overhead - but this is not something I'd count on in general. It actually surprised me how well gcc manages to handle it; even in the variable-indexes case the compiler only emits a few adds and shifts. artur import std.stdio; template IDS(A...) { alias A IDS; } static struct NeatIndex(A, uint N=0) { import std.traits; A* a; ParameterTypeTuple!(A.opIndex)[0..$-1] idxs; auto ref opIndex(IDXS...)(IDXS args) if (N+args.length==idxs.length+1) { return (*a)[idxs[0..$-args.length+1], args]; } auto ref opIndex(IDXS...)(IDXS args) if (N+args.length<idxs.length+1) { idxs[N..N+args.length] = args; return *cast(NeatIndex!(A, N+args.length)*)&this; } } NeatIndex!A neatIdx(A)(ref A a) { NeatIndex!A na = {&a}; return na; } void main() { static struct A { int[3][3][3] data; auto ref opIndex(size_t x, size_t y, size_t z) { return data[x][y][z]; } } A a; foreach (z; 0..3) foreach(y; 0..3) foreach(x; 0..3) a[z,y,x] = 100*z+10*y+x; static assert(!__traits(compiles, a[2][1][0])); auto b = neatIdx(a); writeln(b[2,1,0]); writeln(b[2,1][0]); writeln(b[2][1,0]); writeln(b[2][1][0]); writeln(neatIdx(a)[2][1][0]); }
Jul 01 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, June 30, 2012 21:27:15 Vidar Wahlberg wrote:
 On Saturday, 30 June 2012 at 19:06:31 UTC, Jonathan M Davis wrote:
 On Saturday, June 30, 2012 21:01:02 Vidar Wahlberg wrote:
 This is a very good suggestion, I hadn't thought of this
 possibility, this way I can get my beloved "matrix[x][y];"
 instead of something like "matrix.get(x, y);".

It might have to be matrix[x, y] though, since while you can make opIndex take as many arguments as you want, I don't believe that it will let you split them out into multiple indexing operations like in matrix[x][y].

Doh, you are of course correct. That's slightly unfortunate, then I'm actually leaning a bit more towards creating a "get(x, y)" method as a "[x, y]" construct would be a bit unusual (at least for me, for the time being).

I think that there are languages which actually use [x, y] for rectangular arrays, but I haven't used one that does that either. - Jonathan M Davis
Jun 30 2012
prev sibling next sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Saturday, 30 June 2012 at 19:35:33 UTC, Denis Shelomovskij 
wrote:
 You could be interested in my answer on this thread:
 http://forum.dlang.org/thread/mailman.1578.1339962782.24740.digitalmars-d-learn puremagic.com

Thanks for the tip, that is interesting (I'm surprised I didn't come across this post when searching for this issue earlier). Although it seems to me that you still end up with "matrix[x, y, z]" instead of "matrix[x][y][z]", so it will only solve one half of the problem :) For this particular case I'll just do the conversion from two-dimensional to one-dimensional array programmatically and use a "get(x, y)"-method, it's neither difficult nor will it make the code complicated.
Jun 30 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Saturday, 30 June 2012 at 20:06:58 UTC, Vidar Wahlberg wrote:
 On Saturday, 30 June 2012 at 19:35:33 UTC, Denis Shelomovskij 
 wrote:
 You could be interested in my answer on this thread:
 http://forum.dlang.org/thread/mailman.1578.1339962782.24740.digitalmars-d-learn puremagic.com

Thanks for the tip, that is interesting (I'm surprised I didn't come across this post when searching for this issue earlier). Although it seems to me that you still end up with "matrix[x, y, z]" instead of "matrix[x][y][z]", so it will only solve one half of the problem :) For this particular case I'll just do the conversion from two-dimensional to one-dimensional array programmatically and use a "get(x, y)"-method, it's neither difficult nor will it make the code complicated.

To have syntax m[x][y] you can create a range representing a row that knows its parent range + start offset (equal to x * row.length) and return it from m[x]. This way if m and m[x] are both stored on stack (or in the same cache block) you will not have to pay for additional indirection: to resolve m[x][y] just calculate an index as a sum of start offset and y. You may alternatively simply return a row, but the former approach easily generalizes for slicing by column first (in that case you would need to pick up appropriate syntax, probably a method call).
Jun 30 2012
prev sibling next sibling parent "Roman D. Boiko" <rb d-coding.com> writes:
On Sunday, 1 July 2012 at 09:52:22 UTC, Denis Shelomovskij wrote:
 01.07.2012 13:46, Denis Shelomovskij пишет:
 So I deliberately disallow rule
  "`matrix[x]` means `matrix[x, R[]...]`"
 and made `byTopDimension` property for such iteration:
 `matrix.byTopDimension[x].byTopDimension[y].byTopDimension[z]`

 See:
 http://deoma-cmd.ru/d/docs/src/my/rarray.html#byTopDimension

And yes, addition of `alias byTopDimension opIndex;` in my `RectangularArray` struct should allow your (inconsistent and error-prone, IMHO) syntax.

Just a remark: I like this approach more than the one I provided. Mine is simpler, but this one seems to be more robust and generic.
Jul 01 2012
prev sibling next sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Sunday, 1 July 2012 at 09:46:52 UTC, Denis Shelomovskij wrote:
 I'm curious why do you need such syntax? `matrix[x][y][z]` 
 expression means that `matrix[x]` is also a valid expression 
 but it shouldn't.

It's not a syntax I need, it's a syntax I desire as that's the syntax I'm used to for multidimensional arrays.
Jul 01 2012
prev sibling next sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Sunday, 1 July 2012 at 12:21:59 UTC, Denis Shelomovskij wrote:
 No, that is the syntax you used for arrays of arrays.

In D, yes. In other languages I'm familiar with, such as Java, that syntax is used for "rectangular" arrays. I've grown to like the syntax as used in Java and I wanted to know if it was possible to achieve it in D, Jonathan answered the question perfectly.
Jul 01 2012
prev sibling parent "Vidar Wahlberg" <vidar.wahlberg gmail.com> writes:
On Sunday, 1 July 2012 at 12:21:59 UTC, Denis Shelomovskij wrote:
 No, that is the syntax you used for arrays of arrays.

In D, yes. In other languages I'm familiar with, such as Java, that syntax is used for "rectangular" arrays. I've grown to like the syntax as used in Java and I wanted to know if it was possible to achieve it in D. On Sunday, 1 July 2012 at 13:31:12 UTC, Artur Skawina wrote:
 "matrix[x,y,z]" is a problem, yet "matrix.get(x,y,z)" is fine?

"[x, y, z]" is not a syntax I'm used to, whereas "get(x, y, z)" (or preferably "[x][y][z]") is. It really is just a matter of preference, maybe I can get used to it when I'm more experienced with D. I appreciate all the answers and suggestions, my question was already answered very well by Jonathan, so I'm going to leave this discussion here.
Jul 01 2012