www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - easier way?

reply Jason Spencer <spencer8 sbcglobal.net> writes:
I have the following sample program:

---
import std.algorithm, std.stdio;

real mean(T: Elem[], Elem)(T data) {
   return reduce!("a + b")(0.0, data) / data.length;
}

real mean(T: Elem[][], Elem)(T data) {
   real partSum = 0;
   int count = 0;
   foreach (row; data)
   {
      partSum += reduce!("a + b")(0.0, row);
      count += row.length;
   }
   return partSum / cast(real) count;
}

void main() {
   real[] r1s = [1.8, 2.0, 2.2];
   real[][] r2s = [[1.8, 1.8, 1.8], [2.0, 2.0, 2.0], [2.2, 2.2,
2.2]];
   auto avg1 = mean(r1s);
   auto avg2a = mean!(real[][], real)(r2s);
   auto avg2b = mean(r2s);

   writefln("avg1 = %5.3f, avg2 = %5.3f", avg1, avg2a);
}
---

Compiling gives me:
C:\>bud mean.d
mean.d(23): Error: template mean.mean(T : Elem[],Elem) mean(T :
Elem[],Elem) matches more than one template declaration,
mean.d(3):mean(T : Elem[],Elem) and mean.d(7):mean(T :
Elem[][],Elem)

which is the line declaring avg2b.  I'd much rather not have to
specify the full template arguments and instead have the compiler
figure it out, but I also need the Elem type templated and I need
both 1-D and 2-D (and even 3-D) versions.  Is there a better way to
lay this out?

In general, I'm having a hard time with rectangular arrays.
Although they seem like a huge step up from C arrays, in practice, I
find myself back in pointer land more often than not.  For instance,
I can't/don't know how to:
 - get the count of elements in a n-D array w/o iterating over n-1
dimensions
 - get the size easily without pointer tricks that are still
strictly speaking not safe without iterating over n-1 dimensions
 - (as above) discriminate on array types passed 1-D, esp. at
compile time.  i.e. I can't make any template functions to work on
arrays of varying dimensions unless they're static arrays, which
doesn't really help.

Am I pretty normal, or am I missing a good tutorial somewhere?

Thanks,
Jason
Aug 25 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jason Spencer:

 Is there a better way to lay this out?

Is this good enough for you? import std.stdio: writefln; import std.algorithm: reduce; import std.traits: isArray; real mean(T)(T[] array) if (!isArray!T) { return reduce!q{a + b}(0.0, array) / array.length; } real mean(T)(T[][] mat) if (!isArray!T) { real partSum = 0.0; int count = 0; foreach (row; mat) { partSum += reduce!q{a + b}(0.0, row); count += row.length; } return partSum / count; } void main() { real[] r1s = [1.8, 2.0, 2.2]; auto avg1 = mean(r1s); writefln("avg1 = %5.3f", avg1); real[][] r2s = [[1.8, 1.8, 1.8], [2.0, 2.0, 2.0], [2.2, 2.2, 2.2]]; auto avg2 = mean(r2s); writefln("avg2 = %5.3f", avg2); }
 In general, I'm having a hard time with rectangular arrays.

In D there are no built-in rectangular arrays. There are fixed-sized arrays, that when have more than one coordinate are stored in a single contiguous zone of memory, and there are dynamic arrays, that may contain other dynamic arrays, but in general in such array of array rows may differ in length (your code here takes that into account).
 Although they seem like a huge step up from C arrays, in practice, I
 find myself back in pointer land more often than not.

This is not good in D.
  - get the count of elements in a n-D array w/o iterating over n-1
 dimensions

You can't, because there are no rectangular arrays in D. If you want to be sure they are rectangular you need to build your own matrix struct, that if you want contains a "alias this" :-(
  - get the size easily without pointer tricks that are still
 strictly speaking not safe without iterating over n-1 dimensions

I don't see how they help here. What kind of pointer tricks?
  - (as above) discriminate on array types passed 1-D, esp. at
 compile time.  i.e. I can't make any template functions to work on
 arrays of varying dimensions unless they're static arrays, which
 doesn't really help.

My example may help. Bye, bearophile
Aug 25 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
 In D there are no built-in rectangular arrays.

But probably a nD array struct will be added to Phobos in the following months. You may even write it yourself, for Phobos. As a less nice alternative, you may use a dynamic array of dynamic arrays and use the function preconditions (design by contract) to assert with a loop that the inputs matrices are indeed rectangular. Bye, bearophile
Aug 25 2010
prev sibling next sibling parent reply Jason Spencer <spencer8 sbcglobal.net> writes:
== Quote from bearophile (bearophileHUGS lycos.com)'s article
 Jason Spencer:
 Is there a better way to lay this out?


It's certainly better. I'll have to look at how this plays with some of my other usage, where dimensions come into play, but this is definitely a step forward for me.
  - get the size easily without pointer tricks that are still
 strictly speaking not safe without iterating over n-1 dimensions


Stupid stuff, that won't account for varying row sizes (which isn't anything I need anyway.) Things like (&array[$][$] - &array[0][0])/sizeof(array[0][0]), properly cast, etc.
  - (as above) discriminate on array types passed 1-D, esp. at
 compile time.  i.e. I can't make any template functions to work on
 arrays of varying dimensions unless they're static arrays, which
 doesn't really help.


Yeah, maybe. I guess the matrix thing will end up being what I really need. In that case, I'd probably template on the element type and the size in each dimension, where latter dimensions are allowed to be of 0 size. But to implement that struct/class, I still see it ending up as allocating one memory chunk and indexing manually via pointers. I can never cast a dynamic array to a static array, so I can never write templates for static types. All sizes will still be dynamic. So I'm not sure a class wrapper would buy me that much. Maybe I'm back to your static dispatch trick... Jason
Aug 25 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jason Spencer:

 Stupid stuff, that won't account for varying row sizes (which isn't
 anything I need anyway.)  Things like (&array[$][$] -
 &array[0][0])/sizeof(array[0][0]), properly cast, etc.

Be careful, you risk doing a mess :-) Dynamic arrays of the rows are not guaranteed to be contiguous, they have extra information at the end to support the append, there is the .ptr property that is better than using &, and be careful with sizeof(array) (that in D is a property) because it returns 2*size_t.sizeof for dynamic arrays if you aren't taking an item, or it returns the size in bytes for the whole array if it's a fixed-size one, so it's safer to use T.sizeof, where you know T is the element type.
 Yeah, maybe.  I guess the matrix thing will end up being what I really
 need.  In that case, I'd probably template on the element type and the
 size in each dimension, where latter dimensions are allowed to be of 0
 size.

Specifying the matrix sizes in the template is not useful, you obtain something very similar to a fixed sized array of fixed sized array. I suggest you to create a templated struct where the template arguments are the element type and the number of dimensions. Then you may overload many operators. Internally it allocates a single chunk of memory, so it contains the runtime size of all dimensions (so this struct supports dynamic reshaping too).
 I can never cast a dynamic array to a static array,

Is this OK for you? (this is not good code, and contains a runtime test, I don't remember if it's just an assert): void foo(int[5] arr) {} void main() { int[] sa = new int[5]; int[5] da = sa[0 .. 5]; foo(da); } Note that this code doesn't compile: void foo(int[5] arr) {} void main() { int[] sa = new int[5]; foo(sa[0 .. 5]); } Bye, bearophile
Aug 25 2010
parent reply Jason Spencer <spencer8 sbcglobal.net> writes:
== Quote from bearophile (bearophileHUGS lycos.com)'s article
 I suggest you to create a templated struct where the template

Do you mean just the number of dimensions, or do you mean the size along each dimension? Knowing just the # of dimensions won't tell me the total size or how to index. I need the size of each dimension. I'm starting to think a variadic template might be workable...
 I can never cast a dynamic array to a static array,

void foo(int[5] arr) {} void main() { int[] sa = new int[5]; int[5] da = sa[0 .. 5]; foo(da); }

That's a copy from heap to stack, isn't it? That's what I'm trying to avoid. Jason
Aug 25 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jason Spencer:
 Do you mean just the number of dimensions, or do you mean the size
 along each dimension?

I mean just the number of dimensions.
  Knowing just the # of dimensions won't tell me
 the total size or how to index.  I need the size of each dimension.

If you need that information at compile-time then it's better to just use nD fixed-sized arrays. There is not much need to create an array struct for this purpose.
 That's a copy from heap to stack, isn't it?

Right. (and that code isn't fully syntactically correct, you need to use [] when you copy array contents).
 That's what I'm trying to avoid.<

D doesn't support fixed-sized arrays allocated on the heap, to do that you need to wrap them inside a struct (with no waste of extra memory): struct Foo(int N) { int[N] arr; } void main() { Foo!5* f = new Foo!(5)(); static assert((Foo!(5)).sizeof == 5 * int.sizeof); } Bye, bearophile
Aug 25 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Jason Spencer:
 Knowing just the # of dimensions won't tell me
 the total size or how to index.  I need the size of each dimension.

If you create such structs, you do what you want, so it represents a nD rectangular array. The total size is computed by the product of n runtime values stored inside the struct itself. And the indexing is not a problem, but you need n-1 run-time multiplications to find your index, unless you impose the constraint your sizes are a power of two, so you can replace the multiplications with shifts, that sometimes may give a small performance gain. Bye, bearophile
Aug 25 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:
 What you lose is the CT checking that can be done for multiplication or
 additions if all dimensions are exposed in the type.

If you want, you may create a second nD array struct where sizes too are CT values, plus two methods/free functions to convert between the two array types. Bye, bearophile
Aug 26 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:
 But in the same way that you cannot always convert T[] into T[3], how could
 you convert a Tensor!(int,2) (2D, runtime dims) into a Matrix!(int,4,4)?
 (2D, CT dim: 4*4)

The conversion functions are written by you, so you can all things you want (with just few runtime tests). If they are both on the heap you may even avoid copying data. Bye, bearophile
Aug 26 2010
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016e6d976517abb3b048ebf4d05
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Aug 26, 2010 at 03:50, bearophile <bearophileHUGS lycos.com> wrote:

 Jason Spencer:
 Knowing just the # of dimensions won't tell me
 the total size or how to index.  I need the size of each dimension.

If you create such structs, you do what you want, so it represents a nD rectangular array. The total size is computed by the product of n runtime values stored inside the struct itself. And the indexing is not a problem, but you need n-1 run-time multiplications to find your index, unless you impose the constraint your sizes are a power of two, so you can replace the multiplications with shifts, that sometimes may give a small performance gain.

What bearophile means is something like this: struct Tensor(T, size_t nDim) { T[nDim] dimensions; T[] values; this(...) etc. } so Tensor!(int,3) is a 3D perfectly cubic 'matrix' if ints, with for example dimensions 3*5*8. The only thing CT-defined is the rank (scalar: 0, vector: 1, etc), the number of dimensions. The dimensions themselves are RT values. The total size is the product of these values. It can be calculated during construction. What you lose is the CT checking that can be done for multiplication or additions if all dimensions are exposed in the type. I don't know, would that be an possible addition to Phobos? I remember many people here creating matrix libraries, so someone probably made this already. Concerning arrays of arrays of arrays (or ranges), a useful template is one that gives the number of []'s, the rank: template rank(T) { static if (!isArray!T) // or (isInputRange!T) enum size_t rank = 0; else enum size_t rank = 1 + rank!(ElementType!T); // continue recursion } As you can see, it considers a simple scalar type to have rank 0. Maybe that's not what you want. This allows you to put simple template constraints, like if(rank!T == 2) ... I played with ranges of ranges a few month ago. Maybe some functions there could interest you: http://svn.dsource.org/projects/dranges/trunk/dranges/docs/rangeofranges.html http://www.dsource.org/projects/dranges/browser/trunk/dranges/rangeofranges.d Mapping a range of ranges, creating them by tensorial product, etc. But the docs are still incomplete, and the global module is due a clean-up, I'm afraid. Philippe --0016e6d976517abb3b048ebf4d05 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <br><br><div class=3D"gmail_quote">On Thu, Aug 26, 2010 at 03:50, bearophil= e <span dir=3D"ltr">&lt;<a href=3D"mailto:bearophileHUGS lycos.com">bearoph= ileHUGS lycos.com</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote= " style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, = 204); padding-left: 1ex;"> Jason Spencer:<br> <div class=3D"im">&gt; Knowing just the # of dimensions won&#39;t tell me<b= r> &gt; the total size or how to index. =A0I need the size of each dimension.<= br> <br> </div>If you create such structs, you do what you want, so it represents a = nD rectangular array. The total size is computed by the product of n runtim= e values stored inside the struct itself. And the indexing is not a problem= , but you need n-1 run-time multiplications to find your index, unless you = impose the constraint your sizes are a power of two, so you can replace the= multiplications with shifts, that sometimes may give a small performance g= ain.<br> </blockquote><div><br>What bearophile means is something like this:<br><br>= struct Tensor(T, size_t nDim)<br>{<br>=A0 =A0=A0 T[nDim] dimensions;<br>=A0= =A0=A0 T[] values;<br><br>=A0=A0=A0 this(...) etc.<br>}<br><br>so Tensor!(= int,3) is a 3D perfectly cubic &#39;matrix&#39; if ints, with for example d= imensions 3*5*8.<br> The only thing CT-defined is the rank (scalar: 0, vector: 1, etc), the numb= er of dimensions. The dimensions themselves are RT values. The total size i= s the product of these values. It can be calculated during construction.<br=

additions if all dimensions are exposed in the type.<br>I don&#39;t know, = would that be an possible addition to Phobos? I remember many people here c= reating matrix libraries, so someone probably made this already.<br> <br><br>Concerning arrays of arrays of arrays (or ranges), a useful templat= e is one that gives the number of []&#39;s, the rank:<br><br>template rank(= T)<br>{<br>=A0 static if (!isArray!T)=A0 // or (isInputRange!T)<br>=A0=A0= =A0 enum size_t rank =3D 0;<br> =A0 else<br>=A0=A0=A0 enum size_t rank =3D 1 + rank!(ElementType!T); // con= tinue recursion<br>}<br><br>As you can see, it considers a simple scalar ty= pe to have rank 0. Maybe that&#39;s not what you want.<br>This allows you t= o put simple template constraints, like=A0 if(rank!T =3D=3D 2) ...<br> <br>I played with ranges of ranges a few month ago. Maybe some functions th= ere could interest you:<br><br><a href=3D"http://svn.dsource.org/projects/d= ranges/trunk/dranges/docs/rangeofranges.html">http://svn.dsource.org/projec= ts/dranges/trunk/dranges/docs/rangeofranges.html</a><br> <a href=3D"http://www.dsource.org/projects/dranges/browser/trunk/dranges/ra= ngeofranges.d">http://www.dsource.org/projects/dranges/browser/trunk/drange= s/rangeofranges.d</a><br><br>Mapping a range of ranges, creating them by te= nsorial product, etc. But the docs are still incomplete, and the global mod= ule is due a clean-up, I&#39;m afraid.<br> <br><br>Philippe<br><br></div></div> --0016e6d976517abb3b048ebf4d05--
Aug 26 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016368e327c9c645c048ec158de
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Aug 26, 2010 at 22:41, bearophile <bearophileHUGS lycos.com> wrote:

 Philippe Sigaud:
 What you lose is the CT checking that can be done for multiplication or
 additions if all dimensions are exposed in the type.

If you want, you may create a second nD array struct where sizes too are CT values, plus two methods/free functions to convert between the two array types.

you convert a Tensor!(int,2) (2D, runtime dims) into a Matrix!(int,4,4)? (2D, CT dim: 4*4) Philippe --0016368e327c9c645c048ec158de Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <br><br><div class=3D"gmail_quote">On Thu, Aug 26, 2010 at 22:41, bearophil= e <span dir=3D"ltr">&lt;<a href=3D"mailto:bearophileHUGS lycos.com">bearoph= ileHUGS lycos.com</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote= " style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, = 204); padding-left: 1ex;"> Philippe Sigaud:<br> <div class=3D"im">&gt; What you lose is the CT checking that can be done fo= r multiplication or<br> &gt; additions if all dimensions are exposed in the type.<br> <br> </div>If you want, you may create a second nD array struct where sizes too = are CT values, plus two methods/free functions to convert between the two a= rray types.<br><br></blockquote><div><br>But in the same way that you canno= t always convert T[] into T[3], how could you convert a Tensor!(int,2) (2D,= runtime dims) into a Matrix!(int,4,4)? (2D, CT dim: 4*4)<br> <br><br>Philippe<br><br><br>=A0<br></div></div> --0016368e327c9c645c048ec158de--
Aug 26 2010