digitalmars.D - Linear Algebra

reply mclysenk mtu.edu writes:
One year ago, I started learning D.  The language is positively lovely, with
handy builtin dynamic arrays and powerful templates.  And it's fast too.

But there's one irksome flaw D has in common with every other programming
language I've encountered: no support for linear algebra. This could be
implemented either as a module in the standard library or as a language
extension.
Often my programs make use of vectors and matrices, since they typically involve
some sort of computer graphics.  While it is easy enough to write my own classes
to take care of this, I find it irksome that I need to reinvent the wheel so
many times.

Here are some reasons for a standard vector:
1.  Vectors are frequently used, especially in games, computer graphics, physics
simulations, numerical applications. In some cases, they may even turn up more
frequently than strings.

2.  Implementing an efficient vector is very sensitive to the specifics of the
system and the situation in which it is used.  On most modern machines, there is
some sort of SIMD hardware built specifically to handle these sorts of problems,
but taking advantage of it requires either handcrafted assembly or special
compiler intrinsics to assure the memory is properly aligned and so on.  The
other option is lazy evaluation, which can be very difficult to implement.
In practice, the best approach lies somewhere between the fully vectorized SIMD
approach and lazy evaluation. On an x86, it is best to use lazy evaluation to
break the vectors into chunks of 4 floats, then use SIMD operations to solve
each part.  Implementing this on a case by case basis is rather tedious and
usually reserved only for extremely critical performance bottlenecks.

3.  There are many redundant implementations of vectors floating around, all of
which contain essentially the same components.  A standardized implementation
would make it easier to write common objects like physics engines and graphics
wrappers that would all work together.


I've noticed that on the array page, there is a section on array operations.  If
properly implemented, these may be a good start.  However, it would still be
nice to have a library for higher mathematics as part of a standard library.
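(To make the hybrid approach in point 2 concrete, here is a rough, untested
sketch; the function name is made up, and the SIMD part is only indicated by a
comment.)

// Rough, untested sketch of the hybrid idea from point 2: walk the data in
// chunks of 4 floats, where each chunk is exactly what a 4-wide SIMD register
// holds.
void scaleAdd(float[] a, const(float)[] b, const(float)[] c, float s)
{
    assert(a.length == b.length && b.length == c.length);

    size_t i = 0;
    for (; i + 4 <= a.length; i += 4)
    {
        // A real implementation would replace this chunk with SIMD
        // (e.g. movaps/mulps/addps on all four floats at once).
        a[i]     = b[i]     + c[i]     * s;
        a[i + 1] = b[i + 1] + c[i + 1] * s;
        a[i + 2] = b[i + 2] + c[i + 2] * s;
        a[i + 3] = b[i + 3] + c[i + 3] * s;
    }
    for (; i < a.length; ++i)   // remainder
        a[i] = b[i] + c[i] * s;
}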
Aug 02 2005
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <dcp6a2$45$1 digitaldaemon.com>, mclysenk mtu.edu says...
One year ago, I started learning D.  The language is positively lovely, with
handy builtin dynamic arrays and powerful templates.  And it's fast too.

But there's one irksome flaw D has in common with every other programming
language I've encountered: no support for linear algebra. This could be
implemented either as a module in the standard library or as a language
extension.
Often my programs make use of vectors and matrices, since they typically involve
some sort of computer graphics.  While it is easy enough to write my own classes
to take care of this, I find it irksome that I need to reinvent the wheel so
many times.
For what it's worth, this is a feature Walter seems to be very much interested in. It will probably be a post-1.0 feature, but I think Walter may also be waiting for a syntax/implementation scheme that is both powerful and elegant (expression templates in C++ are quite powerful, but are a beast to implement).
Here are some reasons for a standard vector:
1.  Vectors are frequently used, especially in games, computer graphics, physics
simulations, numerical applications. In some cases, they may even turn up more
frequently than strings.

2.  Implementing an efficient vector is very sensitive to the specifics of the
system and the situation in which it is used.  On most modern machines, there is
some sort of SIMD hardware built specifically to handle these sorts of problems,
but taking advantage of it requires either handcrafted assembly or special
compiler intrinsics to assure the memory is properly aligned and so on.  The
other option is lazy evaluation, which can be very difficult to implement.
In practice, the best approach lies somewhere between the fully vectorized SIMD
approach and lazy evaluation. On an x86, it is best to use lazy evaluation to
break the vectors into chunks of 4 floats, then use SIMD operations to solve
each part.  Implementing this on a case by case basis is rather tedious and
usually reserved only for extremely critical performance bottlenecks.
However this is implemented, you can bet that the bulk of it will be implemented via callbacks in the guts of Phobos. So if you want to use specialized operations you'll likely be able to, even if Walter doesn't provide this by default. And as far as Ares is concerned, this is a sufficiently important feature that my first inclination would be to make it easily extensible without such low-level modifications, so perhaps this will happen in Phobos as well (assuming the two remain separate entities, that is).
I've noticed that on the array page, there is a section on array operations.  If
properly implemented, these may be a good start.  However, it would still be
nice to have a library for higher mathematics as part of a standard library.
If you have some good ideas, by all means propose them. Walter tends to be
quite receptive to this sort of thing.

Sean
Aug 02 2005
prev sibling next sibling parent reply Mike Capp <mike.capp gmail.com> writes:
In article <dcp6a2$45$1 digitaldaemon.com>, mclysenk mtu.edu says...
Here are some reasons for a standard vector:
1.  Vectors are frequently used, especially in games, computer graphics, physics
simulations, numerical applications. In some cases, they may even turn up more
frequently than strings.
True, but I'm wondering if this might be two distinct requests. Game and
graphics code, and a lot of physics code, deal with 2-, 3- or 4-element
vectors, and usually vectors of single-precision floats at that. Writing tuned
implementations of those is a vastly simpler problem than generating optimally
efficient vectors for arbitrary sizes and types.

cheers
Mike
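For the small fixed-size case, even a naive sketch like this one (untested,
with made-up names, and leaning on newer D template syntax) covers most of what
game and graphics code needs:

// Rough, untested sketch: a fixed-size, stack-allocated vector covering the
// common 2/3/4-float case.
struct Vec(T, size_t N)
{
    T[N] data;

    // element-wise addition and subtraction
    Vec opBinary(string op)(Vec rhs) if (op == "+" || op == "-")
    {
        Vec r;
        foreach (i; 0 .. N)
            mixin("r.data[i] = data[i] " ~ op ~ " rhs.data[i];");
        return r;
    }

    // scaling by a scalar
    Vec opBinary(string op : "*")(T s)
    {
        Vec r;
        foreach (i; 0 .. N)
            r.data[i] = data[i] * s;
        return r;
    }

    // dot product
    T dot(Vec rhs)
    {
        T sum = 0;
        foreach (i; 0 .. N)
            sum += data[i] * rhs.data[i];
        return sum;
    }
}

alias Vec3f = Vec!(float, 3);
alias Vec4f = Vec!(float, 4);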
Aug 02 2005
parent reply mclysenk mtu.edu writes:
In article <dcp9sp$2ug$1 digitaldaemon.com>, Mike Capp says...
In article <dcp6a2$45$1 digitaldaemon.com>, mclysenk mtu.edu says...
Here are some reasons for a standard vector:
1.  Vectors are frequently used, especially in games, computer graphics, physics
simulations, numerical applications. In some cases, they may even turn up more
frequently than strings.
True, but I'm wondering if this might be two distinct requests. Game and
graphics code, and a lot of physics code, deal with 2-, 3- or 4-element
vectors, and usually vectors of single-precision floats at that. Writing tuned
implementations of those is a vastly simpler problem than generating optimally
efficient vectors for arbitrary sizes and types.
That is true, but even the slightest improvement would be better than most math
implementations I've seen. When I start prototyping some algorithm, I'll
quickly code up a vector class with just the basic operators to get started.
Just having a standard vector would save me a great deal of time; any
optimizations after the fact are just gravy.

When it comes to optimization, there is a lot the compiler can do. Library
level solutions suffer due to several factors: lazy evaluation requires writing
lots of very complex and hacky templates plus tricky compiler flags, and SIMD
instructions require inline assembler and all its associated woes. The thing
is, the compiler can basically handle both, more elegantly than a programmer
could with a library and inline assembly.

I like the idea of vectorization, but the problem is that the syntax they
present is rather cumbersome. What it seems that they are asking for is a
parallel foreach, which is a good idea, but not exactly what I had in mind. A
better syntax would be something that allows a single operation to work on a
whole list. Something like the following would be nice,

vector float a[4] = [0, 0, 0, 1];
vector float b[4] = [1, 0, 0, 1];
vector float c[4] = [2, 3, 4, 5];

a *= 3.0;    //a == [0, 0, 0, 3]
a = a + b;   //a == [1, 0, 0, 4]
a = c * b;   //a == [2, 0, 0, 5]
a = b + 1.0; //a == [2, 1, 1, 2]
a = sin(b);  //ILLEGAL, only arithmetic operations can be vectorized

//The same could work for any arithmetic type
vector int x[3] = [3, 1, 2];
x <<= 1;     //x == [6, 2, 4]

The vector type qualifier would allow parallel arithmetic to be done on the
vector.
Aug 02 2005
parent reply Sean Kelly <sean f4.ca> writes:
In article <dcpj88$9k7$1 digitaldaemon.com>, mclysenk mtu.edu says...
I like the idea of vectorization, but the problem is that the syntax they
present is rather cumbersome.  What it seems that they are asking for is a
parallel foreach, which is a good idea, but not exactly what I had in mind.  A
better syntax would be something that allows a single operation to work on a
whole list.
Why not keep the parallel foreach bit and leave optimizations as a QOI issue?
Any array operation could be processed in blocks of 4, for example, provided
the underlying implementation were done right.

Sean
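For instance (a rough, untested sketch with made-up names): the user-facing
operation stays a simple per-element apply, and blocking by 4 is just one
choice the implementation is free to make:

// Rough, untested sketch: the interface is "apply this per element"; whether
// the implementation blocks it by 4, unrolls it, or uses SIMD is purely a
// quality-of-implementation detail.
void apply(alias fun, T)(T[] dst, const(T)[] a, const(T)[] b)
{
    assert(dst.length == a.length && a.length == b.length);

    size_t i = 0;
    for (; i + 4 <= dst.length; i += 4)   // one possible QOI choice: blocks of 4
    {
        dst[i]     = fun(a[i],     b[i]);
        dst[i + 1] = fun(a[i + 1], b[i + 1]);
        dst[i + 2] = fun(a[i + 2], b[i + 2]);
        dst[i + 3] = fun(a[i + 3], b[i + 3]);
    }
    for (; i < dst.length; ++i)           // leftover elements
        dst[i] = fun(a[i], b[i]);
}

void usage()
{
    float[] a = new float[100], b = new float[100], c = new float[100];
    b[] = 2.0f;
    c[] = 3.0f;
    apply!((x, y) => x + y)(a, b, c);     // a[i] = b[i] + c[i]
}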
Aug 03 2005
parent reply mclysenk mtu.edu writes:
In article <dcqk1j$15h4$1 digitaldaemon.com>, Sean Kelly says...
In article <dcpj88$9k7$1 digitaldaemon.com>, mclysenk mtu.edu says...
I like the idea of vectorization, but the problem is that the syntax they
present is rather cumbersome.  What it seems that they are asking for is a
parallel foreach, which is a good idea, but not exactly what I had in mind.  A
better syntax would be something that allows a single operation to work on a
whole list.
Why not keep the parallel foreach bit and leave optimizations as a QOI issue? Any array operation could be processed in blocks of 4, for example, provided the underlying implementation were done right.
I agree optimization needs to be QOI, but parallel foreach loops are no
substitute for array operations. What we need is a standard syntax for handling
vector-like objects. Examine the following vector class,

class Vector(T, int N)
{
public:

..

Vector opAdd(Vector v)
{
Vector r = new Vector();

for(size_t i=0; i<N; i++)
r[i] = x[i] + v[i];

return r;
}

Vector opMul(T s)
{
Vector r = new Vector();

for(size_t i=0; i<N; i++)
r[i] = x[i] * s;

return r;
}

..

private:
T[N] x;   //Data
}

The first thing to notice in this implementation is all the repeated code in
each of the operators. Now consider this expression,

Vector!(float, 4) a, b, c;
..
a = b + c * 5.0;

Translates to:

a = b.opAdd(c.opMul(5.0));

This creates unnecessary temporary objects. Lazy evaluation can fix this by
translating the expression into an equivalent component by component function,
but it requires a lot of template trickery. To see an example of this, check
out this article
http://www.flipcode.com/articles/article_fastervectormath.shtml

Of course, this doesn't work in D since templates aren't implicitly
instantiated like they are in C++. Instead, to get an optimized expression you
would have to write the following,

Vector!(float, 4) a, b, c;
..
for(size_t i=0; i<4; i++)
a[i] = b[i] + c[i] * 5.0;

Writing out loops like this defeats the point of a vector class. I propose a
special attribute for arrays to solve the optimization dilemma and give us a
standard vector.

vector float[4] a, b, c;
..
a = b + c * 5.0;

In addition, having a language level construct lets a high quality
implementation take advantage of specialized SIMD hardware features without
compiler intrinsics. For example, floats can be aligned on 16-byte boundaries
in order to use the movaps instruction.
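(For reference, here is a rough, untested sketch of the component-by-component
trick, written as if function templates were implicitly instantiated; every
name in it is made up.)

// Rough, untested sketch of component-by-component lazy evaluation.
// BinExpr records an operation instead of performing it; evaluation
// happens element by element when the result is assigned.
struct Vec(T, size_t N)
{
    T[N] data;

    T opIndex(size_t i) const { return data[i]; }

    // assigning any "expression" (anything indexable) evaluates it per element
    void opAssign(E)(E e) if (!is(E == Vec))
    {
        foreach (i; 0 .. N)
            data[i] = e[i];
    }
}

struct BinExpr(string op, L, R)
{
    L lhs;
    R rhs;

    auto opIndex(size_t i) const
    {
        mixin("return lhs[i] " ~ op ~ " rhs[i];");
    }
}

// helper so scalars can appear in expressions
struct Scalar(T)
{
    T value;
    T opIndex(size_t) const { return value; }
}

auto expr(string op, L, R)(L l, R r) { return BinExpr!(op, L, R)(l, r); }

void example()
{
    Vec!(float, 4) a, b, c;
    b.data[] = [1.0f, 0, 0, 1];
    c.data[] = [2.0f, 3, 4, 5];

    // a = b + c * 5.0, evaluated in one pass with no temporary vectors
    a = expr!"+"(b, expr!"*"(c, Scalar!float(5.0f)));
}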
Aug 03 2005
parent Sean Kelly <sean f4.ca> writes:
In article <dcqq3b$1ajd$1 digitaldaemon.com>, mclysenk mtu.edu says...
I agree optimization needs to be QOI, but parallel foreach loops are no
substitute for array operations. What we need is a standard syntax for handling
vector like objects. Examine the following vector class,

class Vector(T, int N)
{
public:

..

Vector opAdd(Vector v)
{
Vector r = new Vector();

for(size_t i=0; i<N; i++)
r[i] = x[i] + v[i];

return r;
}

Vector opMul(T s)
{
Vector r = new Vector();

for(size_t i=0; i<N; i++)
r[i] = x[i] * s;

return r;
}

..

private:
T[N] x;   //Data
}


The first thing to notice in this implementation is all the repeated code in
each of the operators. Now consider this expression,

Vector!(float, 4) a, b, c;
..
a = b + c * 5.0;


Translates to:

a = b.opAdd(c.opMul(5.0));


This creates unnecessary temporary objects. Lazy evaluation can fix this by
translating the expression into an equivalent component by component function,
but it requires a lot of template trickery. To see an example of this, check out
this article
http://www.flipcode.com/articles/article_fastervectormath.shtml
Yup. This is covered in detail in "C++ Templates" by Vandevoorde and Josuttis
as well. I was thinking about this particular issue this morning and realized
that you might be right that using SIMD-type operations is a similar but not
identical goal to lazy evaluation. The point of lazy evaluation is to avoid the
creation of temporaries (as you've said), since many science applications use
absolutely massive data structures for calculations, while the point of SIMD
operations is to speed up processing by handling operations in blocks. I think
the two goals should work together just fine, but it will take some care to
avoid moving copies of double[4] all over the place.
Of course, this doesn't work in D since templates aren't implicitly instantiated
like they are in C++.  Instead, to get an optimized expression you would have to
write the following,


Vector!(float, 4) a, b, c;
..
for(size_t i=0; i<4; i++)
a[i] = b[i] + c[i] * 5.0;


Writing out loops like this defeats the point of a vector class. I propose a
special attribute for arrays to solve the optimization dilemma and give us a
standard vector.

vector float[4] a, b, c;
..
a = b + c * 5.0;
I *think* basic operations could be done now if you're willing to live with
using function calls:

a.set(c).mul(5.0).add(b);   // infix
add(mul(set(a,c),5.0),b);   // prefix

But this obviously doesn't extend well to complex expressions, where the true
power of lazy evaluation comes into play:

a = (b+c)+(d+e);
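A rough, untested sketch of what that function-call style could look like (the
names are made up):

// Rough, untested sketch: in-place, chainable vector ops that avoid
// temporaries by always mutating and returning the receiver.
struct Vec4
{
    float[4] data;

    ref Vec4 set(Vec4 v)      // copy v into this
    {
        data[] = v.data[];
        return this;
    }

    ref Vec4 add(Vec4 v)      // this += v, element-wise
    {
        foreach (i; 0 .. 4)
            data[i] += v.data[i];
        return this;
    }

    ref Vec4 mul(float s)     // this *= s
    {
        foreach (i; 0 .. 4)
            data[i] *= s;
        return this;
    }
}

void usage()
{
    Vec4 a, b, c;
    b.data[] = [1.0f, 0, 0, 1];
    c.data[] = [2.0f, 3, 4, 5];

    // a = b + c * 5.0, spelled as chained in-place calls
    a.set(c).mul(5.0f).add(b);
}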
In addition, having a language level construct lets a high quality
implementation take advantage of specialized SIMD hardware features without
compiler intrinsics.  For example, floats can be aligned on 16-byte boundaries
in order to use the movaps instruction.
By all means propose something. It's also worth noting that Walter has hinted
at possibly supporting implicit template instantiation at some point in the
future. Likely not a 1.0 feature, but something to keep in the back of your
head nevertheless.

Sean
Aug 03 2005
prev sibling next sibling parent Knud Sørensen <12tkvvb02 sneakemail.com> writes:
On Wed, 03 Aug 2005 01:23:14 +0000, mclysenk wrote:

Hi

Take a look at the D wish list and see whether the "vectorization" suggestion
does it for you:
http://all-technology.com/eigenpolls/dwishlist/

Vectorization is currently the 3rd most-voted-for suggestion,
just after "array initialization/literals" and "Reflection API".
Aug 02 2005
prev sibling parent reply Freejack <freejack nowhere.net> writes:
On Wed, 03 Aug 2005 01:23:14 +0000, mclysenk wrote:

 One year ago, I started learning D.  The language is positively lovely, with
 handy builtin dynamic arrays and powerful templates.  And it's fast too.

 But there's one irksome flaw D has in common with every other programming
 language I've encountered: no support for linear algebra. This could be
 implemented either as a module in the standard library or as a language
 extension. Often my programs make use of vectors and matrices, since they
 typically involve some sort of computer graphics.  While it is easy enough to
 write my own classes to take care of this, I find it irksome that I need to
 reinvent the wheel so many times.

 Here are some reasons for a standard vector:
 1.  Vectors are frequently used, especially in games, computer graphics,
 physics simulations, numerical applications. In some cases, they may even turn
 up more frequently than strings.

 2.  Implementing an efficient vector is very sensitive to the specifics of the
 system and the situation in which it is used.  On most modern machines, there
 is some sort of SIMD hardware built specifically to handle these sorts of
 problems, but taking advantage of it requires either handcrafted assembly or
 special compiler intrinsics to assure the memory is properly aligned and so
 on.  The other option is lazy evaluation, which can be very difficult to
 implement. In practice, the best approach lies somewhere between the fully
 vectorized SIMD approach and lazy evaluation. On an x86, it is best to use
 lazy evaluation to break the vectors into chunks of 4 floats, then use SIMD
 operations to solve each part.  Implementing this on a case by case basis is
 rather tedious and usually reserved only for extremely critical performance
 bottlenecks.

 3.  There are many redundant implementations of vectors floating around, all
 of which contain essentially the same components.  A standardized
 implementation would make it easier to write common objects like physics
 engines and graphics wrappers that would all work together.


 I've noticed that on the array page, there is a section on array operations.
 If properly implemented, these may be a good start.  However, it would still
 be nice to have a library for higher mathematics as part of a standard
 library.
When it comes to higher math, I've found it's usually easier to use another
tool side by side with my C or D code specifically designed for that purpose.
Most of the time I just spawn a J process (www.jsoftware.com) along with my D
application and pass the data back and forth using IPC or J's C API. The matrix
math and such all run about 60% to 70% faster than when I had hand-coded the
routines in C or D.

Trivial array operations don't need to take such an approach, but when it's
needed, it really is needed.

Other than that D kicks ass.

Freejack
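The spawn-and-pipe pattern is roughly the following (an untested sketch using a
newer std.process, and the command line is just a placeholder, not real J
usage):

import std.process : pipeProcess, Redirect, wait;
import std.stdio : writeln;

void main()
{
    // Placeholder command: substitute whatever math engine you talk to.
    auto pipes = pipeProcess(["some-math-engine"],
                             Redirect.stdin | Redirect.stdout);

    // Send the data over the child's stdin...
    pipes.stdin.writeln("1 2 3 4");
    pipes.stdin.close();            // signal end of input

    // ...and read the results back from its stdout.
    foreach (line; pipes.stdout.byLine())
        writeln("result: ", line);

    wait(pipes.pid);
}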
Aug 27 2005
parent Don Clugston <dac nospam.com.au> writes:
Freejack wrote:
 On Wed, 03 Aug 2005 01:23:14 +0000, mclysenk wrote:
 
 
One year ago, I started learning D.  The language is positively lovely, with
handy builtin dynamic arrays and powerful templates.  And it's fast too.

But there's one irksome flaw D has in common with every other programming
language I've encountered: no support for linear algebra. This could be
implemented either as a module in the standard library or as a language
extension.
Often my programs make use of vectors and matrices, since they typically involve
some sort of computer graphics.  While it is easy enough to write my own classes
to take care of this, I find it irksome that I need to reinvent the wheel so
many times.

Here are some reasons for a standard vector:
1.  Vectors are frequently used, especially in games, computer graphics, physics
simulations, numerical applications. In some cases, they may even turn up more
frequently than strings.

2.  Implementing an efficient vector is very sensitive to the specifics of the
system and the situation in which it is used.  On most modern machines, there is
some sort of SIMD hardware built specifically to handle these sorts of problems,
but taking advantage of it requires either handcrafted assembly or special
compiler intrinsics to assure the memory is properly aligned and so on.  The
other option is lazy evaluation, which can be very difficult to implement.
In practice, the best approach lies somewhere between the fully vectorized SIMD
approach and lazy evaluation. On an x86, it is best to use lazy evaluation to
break the vectors into chunks of 4 floats, then use SIMD operations to solve
each part.  Implementing this on a case by case basis is rather tedious and
usually reserved only for extremely critical performance bottlenecks.

3.  There are many redundant implementations of vectors floating around, all of
which contain essentially the same components.  A standardized implementation
would make it easier to write common objects like physics engines and graphics
wrappers that would all work together.


I've noticed that on the array page, there is a section on array operations.  If
properly implemented, these may be a good start.  However, it would still be
nice to have a library for higher mathematics as part of a standard library.
I agree, but I think that if the proposed array operations were implemented, a
mathematics library would not be too difficult to write (at least, much simpler
than for any other language), and could be really fantastic.

As a previous poster said, there are two very different groups of users:

(a) high-speed, low accuracy. Especially for graphics and games, usually using
3-D vectors and 4x4 matrices. These will use SIMD if available.

(b) scientific, high accuracy requirements. May use vectors of reals. Matrices
and vectors will be large, and the size is usually unknown at compile time.

Group (a) is bigger by far. But I spend most of my time in group (b). I really
think that it is a mistake to think that these groups are tightly linked. My
scientific physics work normally uses 3D and 4D vectors of 80-bit reals. You'd
never do that in game physics, and the presence of SIMD means that the
resulting asm code is completely different (and therefore, the poor compiler
writer has to write different optimisers for each group).

I think it would be helpful to state what you'd like to see in such a library,
because your introductory comments suggest you're in group (a), which I have a
lot less experience with. I suspect group (a)'s needs may be easier to
implement. Group (b) has the BLAS as a basic standard; I'm not aware of any
similar standard for group (a).
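To give a flavour of why: with array operations along those lines, the vector
arithmetic discussed in this thread collapses to something like this (an
untested sketch of the proposed a[] = b[] + c[] style):

// Rough sketch of vector arithmetic written with slice/array operations,
// the a[] = b[] + c[] style the arrays page describes.
void example()
{
    float[4] b = [1, 0, 0, 1];
    float[4] c = [2, 3, 4, 5];
    float[4] a;

    a[] = b[] + c[] * 5.0f;   // element-wise, one pass, no temporaries
    a[] *= 3.0f;              // scale in place

    // the same syntax covers large, runtime-sized arrays
    auto x = new double[1000];
    auto y = new double[1000];
    x[] = 1.5;
    y[] = x[] * 2.0 + 1.0;
}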
 When it comes to higher math, I've found it's usually easier to use
 another tool side by side with my C or D code specifically designed for
 that purpose. Most of the time I just spawn a J process (www.jsoftware.com)
 along with my D application and pass the data back and forth using IPC or
 J's C API.
I had a look. Wow! J would have to be the least readable language I've seen since Forth! :-)
 The matrix math and such all run about 60% to 70% faster than when I had
 hand-coded the routines in C or D.
Something must be wrong. Were you using an optimised BLAS library? What sort of operations are you performing?
 Trivial array operations don't need to take such an approach, but when
 it's needed, it really is needed.
 
 Other than that D kicks ass.
Yes, but I also think it has many of the ingredients needed to knock the socks off everything else in linear algebra, too. Someday. -Don.
Sep 07 2005