www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - High performance linear algebra?

reply Stephen Waits <steve waits.net> writes:
Hi,

I'm wondering a few things related to this..

1. Does the operator overload mechanism in D disallow writing high 
performance linear algebra classes (Point, Vector, Matrix)?

2. How inlinable (i.e. with DMD) would such classes be?

3. What about the fact that operators always return copies?

4. What about temporaries?  Are they magically optimized away?  :)

5. The current trend in C++ is to use template expressions and template 
metaprogramming to make something like this:

	Vector a, b, c;
	c = a + b * 0.5f;

compile into something like this:

	c[0] = a[0] + b[0] * 0.5f
	c[1] = a[1] + b[1] * 0.5f
	c[2] = a[2] + b[2] * 0.5f

Note the lack of temporaries here.  Is this a possibility in D?  How far 
does DMD go in this direction?

6. The current trend in C++ is also Boost, which is fugly.  Seems like 
lots of people want this included in C++ (a mistake IMO).  Any plans on 
including similar math functionality into D from the beginning - rather 
than as an ugly afterthought?  This could make D a real option in the 
scientific (and game!) programming domains.

--Steve
May 24 2004
next sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
This is one of my major concerns as well. It seems there is quite some way
to go in that respect, but I'm yet trying to understand what would have to
be done to get what is needed. As I see it, there are several issues:

1) Native multi-dimensional arrays. While not strictly necessary for
implementing a linear algebra library, they would certainly help and are
necessary for numerics in general.

2) Array expressions. They are in the specification but not yet implemented.
I have the feeling that the current specification is by far not flexible
enough. There are elementwise operations on arrays, but I see no way to
extend these in a library. Rather than pursuing this limited solution, I
would rather try to find a truly extendable concept for vector operations.
Basic vector operations should be handled by the compiler, but the user
should be able to extend vector operations.

3) Expression templates: These would be a way to do all the high-performance
vector operations in the library (avoiding temporary copies of arrays) as
it is done in Blitz++, Pooma, Boost and other high-performance C++
libraries. This solution clearly is inferior to true vectorizing capability
in the language, since the compiler cannot optimize it for a given machine.
Anyhow, it should still be possible in D, if the language wants to stand up
against C++. Currently, Expression Templates do not seem to be possible, as
two important features are missing:

* implicit instantiation of function templates
* overloadable assignment expression ("OpAssign")

The first one has been discussed over and over again. Walter wants to avoid
the complexity of it. The second one has not come up on this list lately
but has probably been discussed as well. In both cases, we'll probably have
to wait until someone either manages to convince Walter or finds an
alternative solution.

Answering to your specific questions:

 1. Does the operator overload mechanism in D disallow writing high
 performance linear algebra classes (Point, Vector, Matrix)?

No problem there. The language spec contains a minor flaw concerning non-commuting products, but it should be easy to repair.
 2. How inlinable (i.e. with DMD) would such classes be?

Again, no problem. Inlining works just like in C++.
 3. What about the fact that operators always return copies?

You would return a struct which is handled as value. If you want the actual data handled as reference (like, if you have a large array) that struct can simply encapsulate a pointer.
 4. What about temporaries?  Are they magically optimized away?  :)
 
 5. The current trend in C++ is to use template expressions and template
 metaprogramming to make something like this:
 
 Vector a, b, c;
 c = a + b * 0.5f;
 
 compile into something like this:
 
 c[0] = a[0] + b[0] * 0.5f
 c[1] = a[1] + b[1] * 0.5f
 c[2] = a[2] + b[2] * 0.5f
 
 Note the lack of temporaries here.  Is this a possibility in D?  How far
 does DMD go in this direction?

These two questions go along and show the core problem: Expression templates in C++ offer a tricky way to avoid these costly temporary copies. In D, expression templates are not possible yet. Array expressions are not extensible (i.e. cannot be passed to or returned from functions without creating temporaries)
 6. The current trend in C++ is also Boost, which is fugly.  Seems like
 lots of people want this included in C++ (a mistake IMO).  Any plans on
 including similar math functionality into D from the beginning - rather
 than as an ugly afterthought?  This could make D a real option in the
 scientific (and game!) programming domains.

Boost is a huge collection of very different modules for very different purposes. What they have in common is, that they use template meta-programming, with is a rather new programming technique in C++. Compilers are only starting to support this technique and Boost is still taking them to the limit. Using it in C++ is rather awkward, since many of the language features were not designed with template meta-programming in mind. In D, templates were completely redesigned, taking much of the experience gathered with C++. Meta-programming should be possible in a much cleaner way here, and work will probably continue in that direction. Mixins, for example - one of the newest language enhancements - seem to be far more powerful than expected. This all taken together: Whatever is happening in Boost for C++ will probably happen in the standard library for D already. It will take some time until the language is understood well enough to see how the standard library should be designed, but it will certainly exploit the possibilities of D better than the current C++ standard library does for C++.
May 24 2004
parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <c8trcb$q3q$1 digitaldaemon.com>, Norbert Nemec says...

* implicit instantiation of function templates
* overloadable assignment expression ("OpAssign")

The first one has been discussed over and over again. Walter wants to avoid
the complexity of it. The second one has not come up on this list lately

opAssign? Well, I recently suggested the := operator (colon equals), intended to implement copy-by-value, and defined as: a := b; If a is a class, equivalent to a = new A(b), where A is the type of a. If a is a struct or primative type, equivalent to a = b; In the case of a struct, though, it would actally make sense for opAssign to overload either = or := (not sure which one). Kinda pointless with classes though, since that's what constructors are for. All we need is a better syntax for calling them. Come to think of it - it would make MORE sense to allow structs to have constructors (but not destructors). That way, my proposed := could have the same meaning for structs and classes, and opAssign would be completely unnecessary. Arcane Jill
May 25 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Arcane Jill wrote:

 In article <c8trcb$q3q$1 digitaldaemon.com>, Norbert Nemec says...
 
* implicit instantiation of function templates
* overloadable assignment expression ("OpAssign")

The first one has been discussed over and over again. Walter wants to
avoid the complexity of it. The second one has not come up on this list
lately

opAssign?

Doesn't exist yet. My idea would be to have it only for structs. High-performance numerics would probably not make use of classes at all. My concept of this is not finished yet, so I won't go into details at this point.
 Well, I recently suggested the := operator (colon equals), intended to
 implement copy-by-value, and defined as:
 
 a := b;
 If a is a class, equivalent to a = new A(b), where A is the type of a.
 If a is a struct or primative type, equivalent to a = b;

I'm not sure := would be a good idea. As you say it would mostly be syntactic sugar for "= new ...()" which is not too hard to type considering how often it is needed. I think the danger of confusing "=" and ":=" would outweigh the gain by saving a little on typing.
 In the case of a struct, though, it would actally make sense for opAssign
 to overload either = or := (not sure which one). Kinda pointless with
 classes though, since that's what constructors are for. All we need is a
 better syntax for calling them.

Exactly my idea: = on struct would default to elementwise copying unless there is a opAssign defined in the struct.
 Come to think of it - it would make MORE sense to allow structs to have
 constructors (but not destructors).

True, that might even be the better alternative. But still, I'm not convinced := would be a good idea... Ciao, Nobbi
May 25 2004
prev sibling next sibling parent reply Arcane Jill <Arcane_member pathlink.com> writes:
In article <c8ti50$anc$1 digitaldaemon.com>, Stephen Waits says...

1. Does the operator overload mechanism in D disallow writing high 
performance linear algebra classes (Point, Vector, Matrix)?

I'd say so, yes. I'm doing something similar with multiprecision arithmetic. Basically you have two choices - use a struct or use a class. I chose to use a class. Pros and cons are: struct - PRO: copied by value, so a = b; followed by ++b; will not increment a. CON: copied by value, therefore slow. CON: cannot have constructors or destructors class - PRO: copied by reference, therefore fast. CON: copied by reference, so a = b; followed by ++b will increment a. My solution was to use a class, but to disallow the ASSIGNING operator overloads. So, I allow b = b + 1, but not ++b; This uses more temporaries, but I don't believe that neccesarily makes things slower - this is D after all. It has fast allocators and a garbage collector. An interesting alternative option I had considered was to use a struct CONTAINING a reference. That way I could implement copy-by-value semantics with copy-by-reference efficiency. (This takes some thought, but it's do-able). But I wanted constructors and destructors, so I didn't.
2. How inlinable (i.e. with DMD) would such classes be?

D doesn't have the inline keyword, so just assume the compiler will make the most efficient choice.
3. What about the fact that operators always return copies?

They don't. "return this;" is a perfectly good statement. In my own library, if you try to do a = a + 0; or b = b * 1; you'll get back the same reference you fed in.
4. What about temporaries?  Are they magically optimized away?  :)

I think they are magically left lying around on the heap until the garbage collector comes along.
5. The current trend in C++ is to use template expressions and template 
metaprogramming to make something like this:

	Vector a, b, c;
	c = a + b * 0.5f;

compile into something like this:

	c[0] = a[0] + b[0] * 0.5f
	c[1] = a[1] + b[1] * 0.5f
	c[2] = a[2] + b[2] * 0.5f

Note the lack of temporaries here.  Is this a possibility in D?  How far 
does DMD go in this direction?

As noted above, D will magically inline stuff if it's sufficiently efficient to do so, not otherwise. So you MIGHT get that, or you might get a function call. But if the overhead of a function call is insignificant compared with the overhead of + then who cares? Arcane Jill
May 25 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Arcane Jill wrote:

 In article <c8ti50$anc$1 digitaldaemon.com>, Stephen Waits says...
 
1. Does the operator overload mechanism in D disallow writing high
performance linear algebra classes (Point, Vector, Matrix)?

I'd say so, yes. I'm doing something similar with multiprecision arithmetic. Basically you have two choices - use a struct or use a class. I chose to use a class. Pros and cons are: struct - PRO: copied by value, so a = b; followed by ++b; will not increment a. CON: copied by value, therefore slow. CON: cannot have constructors or destructors class - PRO: copied by reference, therefore fast. CON: copied by reference, so a = b; followed by ++b will increment a.

The speed really depends on the size of the data and the use. When inlining functions, much of the copying of data can be optimized away, while the dereferencing will stay. Therefore structs will often be faster than classes, even if the data is a little larger than just one pointer. (Not too large, of course)
4. What about temporaries?  Are they magically optimized away?  :)

I think they are magically left lying around on the heap until the garbage collector comes along.
5. The current trend in C++ is to use template expressions and template
metaprogramming to make something like this:

Vector a, b, c;
c = a + b * 0.5f;

compile into something like this:

c[0] = a[0] + b[0] * 0.5f
c[1] = a[1] + b[1] * 0.5f
c[2] = a[2] + b[2] * 0.5f

Note the lack of temporaries here.  Is this a possibility in D?  How far
does DMD go in this direction?

As noted above, D will magically inline stuff if it's sufficiently efficient to do so, not otherwise. So you MIGHT get that, or you might get a function call. But if the overhead of a function call is insignificant compared with the overhead of + then who cares?

Judging by these comments, I am not sure whether you, AJ, are aware of the issues of "High performance linear algebra". Creating objects on the heap is fairly efficient, but for hard-core number-crunching numerics it would be a killer. Temporaries like those mentioned by Stephen cannot simply be optimized away by a compiler, but are a fundamental problem unless you play some really serious tricks. Maybe you want to have a read at oonumerics.org: A good starting point would be the documentation of blitz++: http://www.oonumerics.org/blitz/ There are a few papers that really describe the issues of combining the expressiveness of object oriented programming with the need for utmost performance.
May 25 2004
parent reply Stephen Waits <steve waits.net> writes:
Norbert Nemec wrote:
 
 be a killer. Temporaries like those mentioned by Stephen cannot simply be
 optimized away by a compiler, but are a fundamental problem unless you play
 some really serious tricks. Maybe you want to have a read at

This is very true.. Thanks guys for the responses. A few more comments. Ideally, D would allow us to easily take advantage of vectorized instructions. SIMD is your friend here, whether it be AltiVec, SSE, MMX, 3DNow, or whatever. This has been mostly achieved in C++ through the various template molestations. But it's ugly. REALLY UGLY. I plan on doing some experiments in D in the near future to determine the feasibility of this. At first glance, I suspect the performance won't be there. In this day and age, nobody is willing to give up their pretty algebraic interfaces to linear algebra libraries.. VA = VB + BC instead of VecAdd(VA,VB,VC). We want the nice algebraic interface AND the performance. The cherry on top would be if we could get some elegance to go along it! D, being somewhat beautifully designed, and still in its infancy, has a real opportunity here. It could be THE language which displaces FORTRAN and C++ in the scientific and math programming communities - if this type of thing is considered and BUILT INTO the language from day one. The problem with LA libs in C++ is that they're a hack, and an afterthought, and it's a crap shoot as to how any given compiler will behave. Ick... As for my own personal interest? I program games for a living, and would very seriously consider D for at least some modules on a future title - if the performance were there. Thanks, Steve P.S. As an interesting side note, Codeplay has a C++ compiler for several platforms with vectorization in mind. (http://www.codeplay.com/). They also have PS2 (both EE & VU) versions, but it never seemed to catch on. Instead most people are apparently using "nasty C++ tricks" [my term].
May 25 2004
parent =?iso-8859-1?q?Knud_S=F8rensen?= <knud NetRunner.all-technology.com> writes:
Hi

I where working on some array suggestions here
http://dlanguage.netunify.com/56

it is not nearly finished but 
when we are talking on the subject 
I better publish it.

It would be nice if you could help
out by adding ideas, links and options.

So, we have all suggestions for D high performance linear algebra
in one place.

Knud
May 25 2004
prev sibling parent reply Stephen Waits <steve waits.net> writes:
Just thought I'd see if anyone else has any more thoughts on this?  It's 
something that's really bugging me about D, and I see no solutions 
popping up.

Thanks,
Steve
Jun 07 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Stephen Waits wrote:

 Just thought I'd see if anyone else has any more thoughts on this?  It's
 something that's really bugging me about D, and I see no solutions
 popping up.

I've been spending quite some thought on it, but there are a number of individual steps that have to be taken one-by-one: 1) Multidimensional arrays (proposal is online, but it will take probably a while to get it to some usable state) 2) Array expressions - somehow I have the feeling, the current specs (which are not yet implemented at all) are by far not powerful enough to offer serious vector programming. Especially, vectorization has to be possible across functions: ---------------- real[] add(real[] a,real[] b) { return a + b; } real[] a = ...; real[] b = ...; real[] c = a + b; ---------------- I have no idea whether such a thing is possible at all, and whether it can be inlined without a temporary array. 3) Alternatively: array expressions, which would allow to do everything in the library. For these, a number of features would be needed. Most importantly: implicit instantiation of functions. 4) Once we are there, a library will have to be written to use these vectorization techniques for linear algebra. Of course, such a library could be written, but the lack of (1) would make it rather clumsy to use and the lack of either (2) or (3) will not allow vectorization of complex expressions. (i.e. data has always to be stored in temporary arrays to return it from functions.)
Jun 07 2004
parent Stephen Waits <steve waits.net> writes:
Norbert Nemec wrote:
 
 I've been spending quite some thought on it, but there are a number of
 individual steps that have to be taken one-by-one:

Thanks for the info Norbert. Until some of this gets sorted I don't believe D is a valid consideration for real-time 3D stuff. --Steve
Jun 10 2004