digitalmars.D.learn - easier way?

Jason Spencer (52/52) Aug 25 2010 I have the following sample program:

bearophile (32/44) Aug 25 2010 Is this good enough for you?

bearophile (4/5) Aug 25 2010 But probably a nD array struct will be added to Phobos in the following ...
Jason Spencer (17/28) Aug 25 2010 It's certainly better. I'll have to look at how this plays with some

bearophile (21/29) Aug 25 2010 Be careful, you risk doing a mess :-)

Jason Spencer (9/18) Aug 25 2010 arguments are the element type and the number of dimensions.

bearophile (14/20) Aug 25 2010 If you need that information at compile-time then it's better to just us...
bearophile (4/6) Aug 25 2010 If you create such structs, you do what you want, so it represents a nD ...

Philippe Sigaud (40/50) Aug 26 2010 What bearophile means is something like this:

bearophile (4/6) Aug 26 2010 If you want, you may create a second nD array struct where sizes too are...

Philippe Sigaud (5/11) Aug 26 2010 But in the same way that you cannot always convert T[] into T[3], how co...

bearophile (4/7) Aug 26 2010 The conversion functions are written by you, so you can all things you w...

Jason Spencer <spencer8 sbcglobal.net> writes:

I have the following sample program:

---
import std.algorithm, std.stdio;

real mean(T: Elem[], Elem)(T data) {
   return reduce!("a + b")(0.0, data) / data.length;
}

real mean(T: Elem[][], Elem)(T data) {
   real partSum = 0;
   int count = 0;
   foreach (row; data)
   {
      partSum += reduce!("a + b")(0.0, row);
      count += row.length;
   }
   return partSum / cast(real) count;
}

void main() {
   real[] r1s = [1.8, 2.0, 2.2];
   real[][] r2s = [[1.8, 1.8, 1.8], [2.0, 2.0, 2.0], [2.2, 2.2,
2.2]];
   auto avg1 = mean(r1s);
   auto avg2a = mean!(real[][], real)(r2s);
   auto avg2b = mean(r2s);

   writefln("avg1 = %5.3f, avg2 = %5.3f", avg1, avg2a);
}
---

Compiling gives me:
C:\>bud mean.d
mean.d(23): Error: template mean.mean(T : Elem[],Elem) mean(T :
Elem[],Elem) matches more than one template declaration,
mean.d(3):mean(T : Elem[],Elem) and mean.d(7):mean(T :
Elem[][],Elem)

which is the line declaring avg2b.  I'd much rather not have to
specify the full template arguments and instead have the compiler
figure it out, but I also need the Elem type templated and I need
both 1-D and 2-D (and even 3-D) versions.  Is there a better way to
lay this out?

In general, I'm having a hard time with rectangular arrays.
Although they seem like a huge step up from C arrays, in practice, I
find myself back in pointer land more often than not.  For instance,
I can't/don't know how to:
 - get the count of elements in a n-D array w/o iterating over n-1
dimensions
 - get the size easily without pointer tricks that are still
strictly speaking not safe without iterating over n-1 dimensions
 - (as above) discriminate on array types passed 1-D, esp. at
compile time.  i.e. I can't make any template functions to work on
arrays of varying dimensions unless they're static arrays, which
doesn't really help.

Am I pretty normal, or am I missing a good tutorial somewhere?

Thanks,
Jason

Aug 25 2010

bearophile <bearophileHUGS lycos.com> writes:

Jason Spencer:

 Is there a better way to lay this out?

Is this good enough for you?

import std.stdio: writefln;
import std.algorithm: reduce;
import std.traits: isArray;

real mean(T)(T[] array) if (!isArray!T) {
    return reduce!q{a + b}(0.0, array) / array.length;
}

real mean(T)(T[][] mat) if (!isArray!T) {
    real partSum = 0.0;
    int count = 0;
    foreach (row; mat) {
        partSum += reduce!q{a + b}(0.0, row);
        count += row.length;
    }
    return partSum / count;
}

void main() {
    real[] r1s = [1.8, 2.0, 2.2];
    auto avg1 = mean(r1s);
    writefln("avg1 = %5.3f", avg1);

    real[][] r2s = [[1.8, 1.8, 1.8], [2.0, 2.0, 2.0], [2.2, 2.2, 2.2]];
    auto avg2 = mean(r2s);
    writefln("avg2 = %5.3f", avg2);
}



 In general, I'm having a hard time with rectangular arrays.

In D there are no built-in rectangular arrays. There are fixed-sized arrays,
that when have more than one coordinate are stored in a single contiguous zone
of memory, and there are dynamic arrays, that may contain other dynamic arrays,
but in general in such array of array rows may differ in length (your code here
takes that into account).


 Although they seem like a huge step up from C arrays, in practice, I
 find myself back in pointer land more often than not.

This is not good in D.


  - get the count of elements in a n-D array w/o iterating over n-1
 dimensions

You can't, because there are no rectangular arrays in D. If you want to be sure
they are rectangular you need to build your own matrix struct, that if you want
contains a "alias this" :-(


  - get the size easily without pointer tricks that are still
 strictly speaking not safe without iterating over n-1 dimensions

I don't see how they help here. What kind of pointer tricks?


  - (as above) discriminate on array types passed 1-D, esp. at
 compile time.  i.e. I can't make any template functions to work on
 arrays of varying dimensions unless they're static arrays, which
 doesn't really help.

My example may help.

Bye,
bearophile

Aug 25 2010

bearophile <bearophileHUGS lycos.com> writes:

 In D there are no built-in rectangular arrays.

But probably a nD array struct will be added to Phobos in the following months.
You may even write it yourself, for Phobos.

As a less nice alternative, you may use a dynamic array of dynamic arrays and
use the function preconditions (design by contract) to assert with a loop that
the inputs matrices are indeed rectangular.

Bye,
bearophile

Aug 25 2010

Jason Spencer <spencer8 sbcglobal.net> writes:

== Quote from bearophile (bearophileHUGS lycos.com)'s article
 Jason Spencer:
 Is there a better way to lay this out?

 Is this good enough for you?

It's certainly better.  I'll have to look at how this plays with some
of my other usage, where dimensions come into play, but this is
definitely a step forward for me.

  - get the size easily without pointer tricks that are still
 strictly speaking not safe without iterating over n-1 dimensions

 I don't see how they help here. What kind of pointer tricks?

Stupid stuff, that won't account for varying row sizes (which isn't
anything I need anyway.)  Things like (&array[$][$] -
&array[0][0])/sizeof(array[0][0]), properly cast, etc.

  - (as above) discriminate on array types passed 1-D, esp. at
 compile time.  i.e. I can't make any template functions to work on
 arrays of varying dimensions unless they're static arrays, which
 doesn't really help.

 My example may help.

Yeah, maybe.  I guess the matrix thing will end up being what I really
need.  In that case, I'd probably template on the element type and the
size in each dimension, where latter dimensions are allowed to be of 0
size.  But to implement that struct/class, I still see it ending up as
allocating one memory chunk and indexing manually via pointers.  I can
never cast a dynamic array to a static array, so I can never write
templates for static types.  All sizes will still be dynamic.  So I'm
not sure a class wrapper would buy me that much.  Maybe I'm back to
your static dispatch trick...

Jason

Aug 25 2010

bearophile <bearophileHUGS lycos.com> writes:

Jason Spencer:

 Stupid stuff, that won't account for varying row sizes (which isn't
 anything I need anyway.)  Things like (&array[$][$] -
 &array[0][0])/sizeof(array[0][0]), properly cast, etc.

Be careful, you risk doing a mess :-)
Dynamic arrays of the rows are not guaranteed to be contiguous, they have extra
information at the end to support the append, there is the .ptr property that
is better than using &, and be careful with sizeof(array) (that in D is a
property) because it returns 2*size_t.sizeof for dynamic arrays if you aren't
taking an item, or it returns the size in bytes for the whole array if it's a
fixed-size one, so it's safer to use T.sizeof, where you know T is the element
type.



 Yeah, maybe.  I guess the matrix thing will end up being what I really
 need.  In that case, I'd probably template on the element type and the
 size in each dimension, where latter dimensions are allowed to be of 0
 size.

Specifying the matrix sizes in the template is not useful, you obtain something
very similar to a fixed sized array of fixed sized array.
I suggest you to create a templated struct where the template arguments are the
element type and the number of dimensions.
Then you may overload many operators. Internally it allocates a single chunk of
memory, so it contains the runtime size of all dimensions (so this struct
supports dynamic reshaping too).


 I can never cast a dynamic array to a static array,

Is this OK for you? (this is not good code, and contains a runtime test, I
don't remember if it's just an assert):

void foo(int[5] arr) {}
void main() {
    int[] sa = new int[5];
    int[5] da = sa[0 .. 5];
    foo(da);
}


Note that this code doesn't compile:

void foo(int[5] arr) {}
void main() {
    int[] sa = new int[5];
    foo(sa[0 .. 5]);
}

Bye,
bearophile

Aug 25 2010

Jason Spencer <spencer8 sbcglobal.net> writes:

== Quote from bearophile (bearophileHUGS lycos.com)'s article
 I suggest you to create a templated struct where the template

arguments are the element type and the number of dimensions.

Do you mean just the number of dimensions, or do you mean the size

the total size or how to index.  I need the size of each dimension.

I'm starting to think a variadic template might be workable...

 I can never cast a dynamic array to a static array,

 Is this OK for you?
 void foo(int[5] arr) {}
 void main() {
     int[] sa = new int[5];
     int[5] da = sa[0 .. 5];
     foo(da);
 }

That's a copy from heap to stack, isn't it?  That's what I'm trying to
avoid.

Jason

Aug 25 2010

bearophile <bearophileHUGS lycos.com> writes:

Jason Spencer:
 Do you mean just the number of dimensions, or do you mean the size
 along each dimension?

I mean just the number of dimensions.



 the total size or how to index.  I need the size of each dimension.

If you need that information at compile-time then it's better to just use nD
fixed-sized arrays. There is not much need to create an array struct for this
purpose.


 That's a copy from heap to stack, isn't it?

Right. (and that code isn't fully syntactically correct, you need to use []
when you copy array contents).


 That's what I'm trying to avoid.<

D doesn't support fixed-sized arrays allocated on the heap, to do that you need
to wrap them inside a struct (with no waste of extra memory):

struct Foo(int N) {
    int[N] arr;
}
void main() {
    Foo!5* f = new Foo!(5)();
    static assert((Foo!(5)).sizeof == 5 * int.sizeof);
}

Bye,
bearophile

Aug 25 2010

bearophile <bearophileHUGS lycos.com> writes:

Jason Spencer:

 the total size or how to index.  I need the size of each dimension.

If you create such structs, you do what you want, so it represents a nD
rectangular array. The total size is computed by the product of n runtime
values stored inside the struct itself. And the indexing is not a problem, but
you need n-1 run-time multiplications to find your index, unless you impose the
constraint your sizes are a power of two, so you can replace the
multiplications with shifts, that sometimes may give a small performance gain.

Bye,
bearophile

Aug 25 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Thu, Aug 26, 2010 at 03:50, bearophile <bearophileHUGS lycos.com> wrote:

 Jason Spencer:

 the total size or how to index.  I need the size of each dimension.

 If you create such structs, you do what you want, so it represents a nD
 rectangular array. The total size is computed by the product of n runtime
 values stored inside the struct itself. And the indexing is not a problem,
 but you need n-1 run-time multiplications to find your index, unless you
 impose the constraint your sizes are a power of two, so you can replace the
 multiplications with shifts, that sometimes may give a small performance
 gain.

What bearophile means is something like this:

struct Tensor(T, size_t nDim)
{
     T[nDim] dimensions;
     T[] values;

    this(...) etc.
}

so Tensor!(int,3) is a 3D perfectly cubic 'matrix' if ints, with for example
dimensions 3*5*8.
The only thing CT-defined is the rank (scalar: 0, vector: 1, etc), the
number of dimensions. The dimensions themselves are RT values. The total
size is the product of these values. It can be calculated during
construction.

What you lose is the CT checking that can be done for multiplication or
additions if all dimensions are exposed in the type.
I don't know, would that be an possible addition to Phobos? I remember many
people here creating matrix libraries, so someone probably made this
already.


Concerning arrays of arrays of arrays (or ranges), a useful template is one
that gives the number of []'s, the rank:

template rank(T)
{
  static if (!isArray!T)  // or (isInputRange!T)
    enum size_t rank = 0;
  else
    enum size_t rank = 1 + rank!(ElementType!T); // continue recursion
}

As you can see, it considers a simple scalar type to have rank 0. Maybe
that's not what you want.
This allows you to put simple template constraints, like  if(rank!T == 2)
...

I played with ranges of ranges a few month ago. Maybe some functions there
could interest you:

http://svn.dsource.org/projects/dranges/trunk/dranges/docs/rangeofranges.html
http://www.dsource.org/projects/dranges/browser/trunk/dranges/rangeofranges.d

Mapping a range of ranges, creating them by tensorial product, etc. But the
docs are still incomplete, and the global module is due a clean-up, I'm
afraid.


Philippe

Aug 26 2010

bearophile <bearophileHUGS lycos.com> writes:

Philippe Sigaud:
 What you lose is the CT checking that can be done for multiplication or
 additions if all dimensions are exposed in the type.

If you want, you may create a second nD array struct where sizes too are CT
values, plus two methods/free functions to convert between the two array types.

Bye,
bearophile

Aug 26 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

On Thu, Aug 26, 2010 at 22:41, bearophile <bearophileHUGS lycos.com> wrote:

 Philippe Sigaud:
 What you lose is the CT checking that can be done for multiplication or
 additions if all dimensions are exposed in the type.

 If you want, you may create a second nD array struct where sizes too are CT
 values, plus two methods/free functions to convert between the two array
 types.

But in the same way that you cannot always convert T[] into T[3], how could
you convert a Tensor!(int,2) (2D, runtime dims) into a Matrix!(int,4,4)?
(2D, CT dim: 4*4)


Philippe

Aug 26 2010

bearophile <bearophileHUGS lycos.com> writes:

Philippe Sigaud:
 But in the same way that you cannot always convert T[] into T[3], how could
 you convert a Tensor!(int,2) (2D, runtime dims) into a Matrix!(int,4,4)?
 (2D, CT dim: 4*4)

The conversion functions are written by you, so you can all things you want
(with just few runtime tests). If they are both on the heap you may even avoid
copying data.

Bye,
bearophile

Aug 26 2010

D Programming

C/C++ Programming

Other

digitalmars.D.learn - easier way?