
digitalmars.D - std.linalg

reply "FreeSlave" <freeslave93 gmail.com> writes:
There is a "Matrices and linear algebra" module in the wish
list. Let's discuss its design. D is a complicated language, so
it's difficult to choose the right way here. We need to find a
compromise between efficiency and a convenient interface. I'm
going to make some suggestions about how this module should look.

First of all, it should provide two templates for matrices.
Let's call them StaticMatrix and DynamicMatrix. The first one
has a "templated" size and therefore may use static arrays and
compile-time checks. It can be useful when the size is
determined by our needs, for example, in graphics.
DynamicMatrix has a variable size, i.e. it must be allocated on
the heap. It can be useful in all other math areas.
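A rough sketch of what the two templates might look like (the names, the row-major layout, and the members are purely illustrative, not a final design):

```d
// Size fixed at compile time; storage is a static array, no heap allocation.
struct StaticMatrix(T, size_t rows, size_t cols)
{
    T[rows * cols] data;

    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];   // row-major, for illustration
    }
}

// Size chosen at runtime; storage is a heap-allocated dynamic array.
struct DynamicMatrix(T)
{
    T[] data;
    size_t rows, cols;

    this(size_t r, size_t c)
    {
        rows = r;
        cols = c;
        data = new T[r * c];
    }

    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }
}

void main()
{
    StaticMatrix!(float, 2, 2) s;        // dimensions are template parameters
    s[0, 1] = 3.0f;
    assert(s[0, 1] == 3.0f);

    auto d = DynamicMatrix!double(3, 4); // dimensions picked at runtime
    d[2, 3] = 1.5;
    assert(d[2, 3] == 1.5);
}
```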

Both templates should support all floating-point types and,
moreover, user-defined ones (for example, wrappers for the GMP
library and others).

For efficiency, in both cases matrices should use a
one-dimensional array for their inner representation. But
actually I'm not sure whether matrices should support other
container types besides standard D arrays. The good thing about
one-dimensional arrays is that they can be easily exposed to
foreign functions, for example, to C libraries and OpenGL. So
we should take care of the memory layout - at least row-major
and column-major. I think it can be templated too.
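Templating on the layout could be as simple as parameterizing the flat-index computation; a minimal sketch (StorageOrder and flatIndex are hypothetical names):

```d
enum StorageOrder { rowMajor, columnMajor }

// Maps a (row, column) pair onto an index into the flat 1-D storage.
size_t flatIndex(StorageOrder order)(size_t r, size_t c,
                                     size_t rows, size_t cols)
{
    static if (order == StorageOrder.rowMajor)
        return r * cols + c;
    else
        return c * rows + r;
}

void main()
{
    // The same element (0, 2) of a 2x3 matrix lands in different
    // flat slots depending on the layout.
    assert(flatIndex!(StorageOrder.rowMajor)(0, 2, 2, 3) == 2);
    assert(flatIndex!(StorageOrder.columnMajor)(0, 2, 2, 3) == 4);
}
```

Either ordering can then be handed directly to a C library or OpenGL that expects that layout, without any conversion pass.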

But another question arises - which "majority" should we use in
the interface? The interface should not depend on the inner
representation. All functions need unambiguity to avoid
complicating and repeating the design. Well, actually we could
deal with different majorities in the interface - we could
provide something like an "asTransposed" adapter that functions
would apply if needed - but then we would force the user to
check the majority of the matrix interface, which is not a very
good approach.

Sometimes a user takes data from some other source and wants to
avoid copying during Matrix construction, but she also wants to
get matrix functionality. So we should provide an
"arrayAsMatrix" adapter that can adapt one-dimensional and
two-dimensional arrays, making them feel like matrices. It
definitely should not make a copy of a dynamic array, but I'm
not sure about static ones.
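For the dynamic-array case, a zero-copy adapter might look like this (ArrayMatrixView and arrayAsMatrix are hypothetical names; the view simply aliases the caller's slice):

```d
// A view over an existing slice; the matrix and the array share storage.
struct ArrayMatrixView(T)
{
    T[] data;
    size_t rows, cols;

    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];   // row-major, for illustration
    }
}

auto arrayAsMatrix(T)(T[] data, size_t rows, size_t cols)
{
    assert(data.length == rows * cols);
    return ArrayMatrixView!T(data, rows, cols);
}

void main()
{
    auto raw = [1.0, 2.0, 3.0, 4.0];
    auto m = arrayAsMatrix(raw, 2, 2);
    m[1, 0] = 9.0;            // writes through the view...
    assert(raw[2] == 9.0);    // ...are visible in the original array
}
```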

About operator overloading. It's quite clear for the 'add' and
'subtract' operations, but what about the product? Here I think
all 'op'-functions should be element-by-element operations.
So we can use all the other operations too without ambiguity.
For actual matrix multiplication the module can provide a
'multiply' or 'product' function. It's similar to Maxima's
approach, except that Maxima uses dot notation for these needs.
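To illustrate the distinction, here is a minimal 2x2 sketch where every overloaded operator is element-wise and the matrix product is a separately named function (Mat2 and multiply are illustrative names only):

```d
struct Mat2
{
    double[4] data;   // 2x2, row-major

    // Every overloaded operator works element by element, so
    // '+', '-', '*', '/' all stay unambiguous.
    Mat2 opBinary(string op)(Mat2 rhs) const
    {
        Mat2 r;
        foreach (i; 0 .. 4)
            mixin("r.data[i] = data[i] " ~ op ~ " rhs.data[i];");
        return r;
    }
}

// The true matrix product gets its own named function.
Mat2 multiply(Mat2 a, Mat2 b)
{
    Mat2 r;
    foreach (i; 0 .. 2)
        foreach (j; 0 .. 2)
        {
            r.data[i * 2 + j] = 0;
            foreach (k; 0 .. 2)
                r.data[i * 2 + j] += a.data[i * 2 + k] * b.data[k * 2 + j];
        }
    return r;
}

void main()
{
    auto a = Mat2([1.0, 2.0, 3.0, 4.0]);
    auto b = Mat2([5.0, 6.0, 7.0, 8.0]);
    assert((a * b).data == [5.0, 12.0, 21.0, 32.0]);         // element-wise
    assert(multiply(a, b).data == [19.0, 22.0, 43.0, 50.0]); // matrix product
}
```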

Transposition. I've already mentioned the "asTransposed"
adapter. It should be useful for making a matrix feel
transposed without copying it. We can also implement
'transpose' and 'transposed' functions. The first one
transposes the matrix in place. It's actually not allowed for a
non-square StaticMatrix, since we can't change the size of this
type of matrix at runtime. The second one returns a copy, so
it's applicable in all cases. Actually, I'm not sure whether
these functions should be member functions or not.
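The "asTransposed" idea can be sketched as a view that just swaps the meaning of the two indices, without touching the data (Mat, Transposed, and asTransposed are illustrative names, assuming a two-argument opIndex interface):

```d
struct Mat
{
    double[6] data;   // 2 rows x 3 columns, row-major
    enum rows = 2, cols = 3;

    ref double opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }
}

// A transposed "view": indices are swapped, no data is moved or copied.
struct Transposed(M)
{
    M* source;

    auto ref opIndex()(size_t r, size_t c)
    {
        return (*source)[c, r];
    }
}

auto asTransposed(M)(ref M m)
{
    return Transposed!M(&m);
}

void main()
{
    Mat m;
    m[0, 2] = 7.0;
    auto t = m.asTransposed;
    assert(t[2, 0] == 7.0);  // element (2,0) of the view is (0,2) of m
}
```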

Invertible matrix. It must not be allowed for square
StaticMatrix. DynamicMatrix may make checks at runtime. Same as
for transposition, we can implement 'invert' to invert in place
and 'inverted' to make a copy. There is an issue here - what
should we do when the determinant is 0? I believe the best
approach here is to throw an exception, since if the user needs
an invertible matrix, it's definitely exceptional when it can't
be calculated.
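A minimal sketch of the 'invert'/'inverted' pair for a 2x2 case, throwing when the determinant is 0 (Matrix2 and the function names are illustrative, not a proposed final API):

```d
import std.exception : enforce;

struct Matrix2
{
    double[4] data;   // 2x2, row-major
}

// In-place inversion; throws if the determinant is 0.
void invert(ref Matrix2 m)
{
    immutable det = m.data[0] * m.data[3] - m.data[1] * m.data[2];
    enforce(det != 0, "cannot invert: determinant is 0");
    immutable a = m.data[0];
    m.data = [ m.data[3] / det, -m.data[1] / det,
              -m.data[2] / det,  a / det];
}

// Copying variant, built on top of the in-place one.
Matrix2 inverted(Matrix2 m)
{
    invert(m);    // m is already a copy here: structs are value types
    return m;
}

void main()
{
    import std.exception : assertThrown;

    auto m = Matrix2([4.0, 7.0, 2.0, 6.0]);   // det = 10
    auto inv = inverted(m);
    assert(inv.data == [0.6, -0.7, -0.2, 0.4]);
    assert(m.data == [4.0, 7.0, 2.0, 6.0]);   // original untouched

    auto singular = Matrix2([1.0, 2.0, 2.0, 4.0]);  // det = 0
    assertThrown(invert(singular));
}
```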

Please, make your suggestions too.
Oct 11 2013
next sibling parent reply "ponce" <contact gam3sfrommmmmars.fr> writes:
Good idea, since there have been so many implementations of
"fixed-size vectors/matrices" since the beginning of time.
There have been efforts in the past to make this more
standardized.

There will be

On Friday, 11 October 2013 at 16:10:21 UTC, FreeSlave wrote:
 Both templates should support all floating-point types and,
 moreover, user-defined ones (for example, wrappers for the
 GMP library and others).
And integers, e.g. for video processing. Thankfully it's not that much harder.
 For efficiency, in both cases matrices should use a
 one-dimensional array for their inner representation. But
 actually I'm not sure whether matrices should support other
 container types besides standard D arrays. The good thing
 about one-dimensional arrays is that they can be easily
 exposed to foreign functions, for example, to C libraries and
 OpenGL. So we should take care of the memory layout - at
 least row-major and column-major. I think it can be templated
 too.
About 3D graphics: I don't think choosing one or the other is that big a deal. I tend to prefer row-major order, but e.g. gl3n uses the reverse. It's up to you. If you template on the order, you will be able to implement your "asTransposed" idea.
 Sometimes a user takes data from some other source and wants
 to avoid copying during Matrix construction, but she also
 wants to get matrix functionality. So we should provide an
 "arrayAsMatrix" adapter that can adapt one-dimensional and
 two-dimensional arrays, making them feel like matrices. It
 definitely should not make a copy of a dynamic array, but I'm
 not sure about static ones.
Possible with pointer cast? (strict aliasing rule aside)
 About operator overloading. It's quite clear for the 'add'
 and 'subtract' operations, but what about the product? Here I
 think all 'op'-functions should be element-by-element
 operations.
The product, to me, is a special case. Element-wise matrix multiplication is seldom used; it's a lot more common to want normal matrix multiplication. I and others use "*" for that purpose.

https://github.com/Dav1dde/gl3n/blob/master/gl3n/linalg.d#L1734
https://github.com/p0nce/gfm/blob/master/gfm/math/matrix.d#L150
https://github.com/gecko0307/dlib/blob/master/dlib/math/matrix.d#L171
 So we can use all the other operations too without ambiguity.
 For actual matrix multiplication the module can provide a
 'multiply' or 'product' function. It's similar to Maxima's
 approach, except that Maxima uses dot notation for these
 needs.
But it's dissimilar to shader languages and existing D libraries.
 Invertible matrix. It must not be allowed for square
 StaticMatrix. DynamicMatrix may make checks at runtime. Same
 as for transposition, we can implement 'invert' to invert in
 place and 'inverted' to make a copy. There is an issue here -
 what should we do when the determinant is 0?
Use enforce to crash. Dividing by zero is not recoverable.
Oct 11 2013
parent "FreeSlave" <freeslave93 gmail.com> writes:
On Friday, 11 October 2013 at 17:08:26 UTC, ponce wrote:
 Good idea, since there have been so many implementations of
 "fixed-size vectors/matrices" since the beginning of time.
 There have been efforts in the past to make this more
 standardized.

 There will be

 On Friday, 11 October 2013 at 16:10:21 UTC, FreeSlave wrote:
 Both templates should support all floating-point types and,
 moreover, user-defined ones (for example, wrappers for the
 GMP library and others).
And integers, e.g. for video processing. Thankfully it's not that much harder.
 For efficiency, in both cases matrices should use a
 one-dimensional array for their inner representation. But
 actually I'm not sure whether matrices should support other
 container types besides standard D arrays. The good thing
 about one-dimensional arrays is that they can be easily
 exposed to foreign functions, for example, to C libraries and
 OpenGL. So we should take care of the memory layout - at
 least row-major and column-major. I think it can be templated
 too.
About 3D graphics: I don't think choosing one or the other is that big a deal. I tend to prefer row-major order, but e.g. gl3n uses the reverse. It's up to you. If you template on the order, you will be able to implement your "asTransposed" idea.
 Sometimes a user takes data from some other source and wants
 to avoid copying during Matrix construction, but she also
 wants to get matrix functionality. So we should provide an
 "arrayAsMatrix" adapter that can adapt one-dimensional and
 two-dimensional arrays, making them feel like matrices. It
 definitely should not make a copy of a dynamic array, but I'm
 not sure about static ones.
Possible with pointer cast? (strict aliasing rule aside)
Well, a pointer cast would work for one-dimensional arrays, but not for dynamic two-dimensional matrices. Anyway, that's implementation-specific, not a design question.
 About operator overloading. It's quite clear for the 'add'
 and 'subtract' operations, but what about the product? Here I
 think all 'op'-functions should be element-by-element
 operations.
The product, to me, is a special case. Element-wise matrix multiplication is seldom used; it's a lot more common to want normal matrix multiplication. I and others use "*" for that purpose.

https://github.com/Dav1dde/gl3n/blob/master/gl3n/linalg.d#L1734
https://github.com/p0nce/gfm/blob/master/gfm/math/matrix.d#L150
https://github.com/gecko0307/dlib/blob/master/dlib/math/matrix.d#L171
 So we can use all the other operations too without ambiguity.
 For actual matrix multiplication the module can provide a
 'multiply' or 'product' function. It's similar to Maxima's
 approach, except that Maxima uses dot notation for these
 needs.
But it's dissimilar to shader languages and existing D libraries.
That's actually a common question - whether to use operator overloading for multiplication or not. Some people don't like it; others prefer it that way. That's why I want to discuss it. "*" is more practical, but the user always expects "/" as its opposite, and matrices have no division operation (except division by a scalar), only multiplication by an invertible matrix.
 Invertible matrix. It must not be allowed for square 
 StaticMatrix.
I meant non-square, of course. Sorry for this mistake. I don't know how to edit my posts here.

Another issue is the extraction of minors. It could use some operator-overloading magic and look like

    matrix(start, middle1, middle2, end)(start, middle1, middle2, end)

where the first () stands for rows and the second () stands for columns, and the gap is between middle1 and middle2. There is an efficiency question here - should it make a copy, or work like some kind of two-dimensional range?
Oct 11 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 11, 2013 at 06:10:19PM +0200, FreeSlave wrote:
 There is a "Matrices and linear algebra" module in the wish
 list. Let's discuss its design. D is a complicated language,
 so it's difficult to choose the right way here. We need to
 find a compromise between efficiency and a convenient
 interface. I'm going to make some suggestions about how this
 module should look.
I think we need to differentiate between multidimensional arrays (as a data storage type) and linear algebra (operations performed on 2D arrays). These two intersect, but they also have areas that are not compatible with each other (e.g. matrix product vs. element-by-element product). Ideally, we should support both in a clean way.

As far as the former is concerned, Denis has implemented a multidimensional array library, and I've independently done the same, with a slightly different interface. I think one or two others have implemented similar libraries as well. It would be good if we standardize the API so that our code can become interoperable.

As far as the latter is concerned, I've been meaning to implement a double-description convex hull algorithm, but have been too busy to actually work on it. This particular algorithm is interesting, because it stress-tests (1) performance of D algorithms, and (2) challenges the design of matrix APIs because while the input vertices (resp. hyperplanes) can be interpreted as a matrix, the algorithm itself also needs to permute rows, which means it is most efficient when given an array-of-pointers representation, contrary to the usual flattened representations (as proposed below). I think there's a place for both, which is why we need to distinguish between data representation and the algorithms that work on them.
 First of all, it should provide two templates for matrices.
 Let's call them StaticMatrix and DynamicMatrix. The first one
 has a "templated" size and therefore may use static arrays
 and compile-time checks. It can be useful when the size is
 determined by our needs, for example, in graphics.
 DynamicMatrix has a variable size, i.e. it must be allocated
 on the heap. It can be useful in all other math areas.
I like this idea. Ideally, we should have many possible representations, but all conforming to a single API understood by all algorithms, so that you only have to write algorithms once, and they will work with any data structure. That's one key advantage of D, and we should make good use of it.

I do not want to see a repetition of the C++ situation where there are so many different matrix/multidimensional array libraries, and all of them use incompatible representations and you cannot freely pass data from one to algorithms in the other. Then when no library meets precisely what you need, you're forced to reinvent yet another matrix class, which is a waste of time.
 Both templates should support all floating-point types and,
 moreover, user-defined ones (for example, wrappers for the
 GMP library and others).
Definitely, yes.
 For efficiency, in both cases matrices should use a
 one-dimensional array for their inner representation. But
 actually I'm not sure whether matrices should support other
 container types besides standard D arrays. The good thing
 about one-dimensional arrays is that they can be easily
 exposed to foreign functions, for example, to C libraries and
 OpenGL. So we should take care of the memory layout - at
 least row-major and column-major. I think it can be templated
 too.
We should not tie algorithms to specific data representations (concrete types). One key advantage of D is that you can write algorithms generically, such that they can work with *any* type as long as it conforms to a standard API.

One excellent example is the range API: *anything* that conforms to the range API can be used with std.algorithm, not just a specific representation. In fact, std.range provides a whole bunch of different ranges and range wrappers, and all of them automatically can be used with std.algorithm, because the code in std.algorithm uses only the range API and never (at least in theory :P) depends on concrete types. We should take advantage of this feature.

It would be good, of course, to provide some standard, commonly-used representations, for example row-major (or column-major) matrix classes / structs, etc.. But the algorithms should not directly depend on these concrete types. An algorithm that works with a matrix stored as a 1D array should also work with a matrix stored as a nested array of arrays, as well as a sparse matrix representation that uses some other kind of storage mechanism. As long as a type conforms to some standard matrix API, it should Just Work(tm) with any std.linalg algorithm.
 But another question arises - which "majority" should we use
 in the interface? The interface should not depend on the
 inner representation. All functions need unambiguity to avoid
 complicating and repeating the design. Well, actually we
 could deal with different majorities in the interface - we
 could provide something like an "asTransposed" adapter that
 functions would apply if needed - but then we would force the
 user to check the majority of the matrix interface, which is
 not a very good approach.
Algorithms shouldn't even care what majority the data representation is in. It should only access data via the standardized matrix API (whatever it is we decide on). The input type should be templated so that *any* type that conforms to this API will work. Of course, for performance-sensitive code, the user should be aware of which representations are best-performing, and make sure to pass in the appropriate type of representations; but we should not prematurely optimize here. Any linear algebra algorithms should be able to work with *any* type that conforms to a standard matrix API.
 Sometimes a user takes data from some other source and wants
 to avoid copying during Matrix construction, but she also
 wants to get matrix functionality. So we should provide an
 "arrayAsMatrix" adapter that can adapt one-dimensional and
 two-dimensional arrays, making them feel like matrices. It
 definitely should not make a copy of a dynamic array, but I'm
 not sure about static ones.
If a function expects a 1xN matrix, we should be able to pass in an array and it should Just Work. Manually using adapters should not be needed. Of course, standard concrete matrix types provided by the library should have ctors / factory methods for initializing a matrix object that uses some input array as initial data -- if we design this correctly, it should be a cheap operation (the matrix type itself should just be a thin wrapper over the array to provide methods that conform to the standard matrix API). Then if some function F requires a matrix object, we should be able to just create a Matrix instance with our input array as initial data, and pass it to F.
 About operator overloading. It's quite clear for the 'add'
 and 'subtract' operations, but what about the product? Here I
 think all 'op'-functions should be element-by-element
 operations. So we can use all the other operations too
 without ambiguity. For actual matrix multiplication the
 module can provide a 'multiply' or 'product' function. It's
 similar to Maxima's approach, except that Maxima uses dot
 notation for these needs.
Here is where we see the advantage of separating representation from algorithm. Technically, a matrix is not the same thing as a 2D array, because a matrix has a specific interpretation in linear algebra, whereas a 2D array is just a 2D container of some elements.

My suggestion would be to write a Matrix struct that wraps around a 2D array, and provides / overrides the overloaded operators to have a linear algebra interpretation. So, a 2D array type should have per-element operations, but once wrapped in a Matrix struct, it will acquire special matrix algebra operations like matrix products, inversion, etc.. In the most general case, a 2D array should be a specific instance of a multidimensional array, and a Matrix struct should be able to use any underlying representation that conforms to a 2D array API. For example:

    // Example of a generic multidimensional array type
    struct Array(int dimension, ElemType)
    {
        ...
        Array opBinary(string op)(Array x)
        {
            // implement per-element operations here
        }
    }

    // A matrix wrapper around a 2D array type.
    struct Matrix(T) if (is2DArray!T)
    {
        T representation;

        Matrix opBinary(string op)(Matrix x) if (op == "*")
        {
            // implement matrix multiplication here
        }

        Matrix opBinary(string op)(Matrix x) if (op != "*")
        {
            // forward to representation.opBinary to default
            // to per-element operations
        }

        // Provide operations specific to matrices that don't
        // exist in general multidimensional arrays.
        Matrix invert() { ... }
    }

    Array!(2,float) myArray, myOtherArray;
    auto arrayProd = myArray * myOtherArray; // per-element multiplication

    auto A = Matrix(myArray);       // wrap array in Matrix wrapper
    auto B = Matrix(myOtherArray);
    auto C = A * B;                 // matrix product

The idea of the Matrix struct here is that the user should be free to choose any underlying matrix representation: a 1D array in row-major or column-major representation, or a nested array of arrays, or a sparse array with some other kind of representation.
As long as they provide a standard way of accessing array elements, Matrix should be able to accept them, and provide matrix algebra semantics for them.
 Transposition. I've already mentioned the "asTransposed"
 adapter. It should be useful for making a matrix feel
 transposed without copying it. We can also implement
 'transpose' and 'transposed' functions. The first one
 transposes the matrix in place. It's actually not allowed for
 a non-square StaticMatrix, since we can't change the size of
 this type of matrix at runtime. The second one returns a
 copy, so it's applicable in all cases. Actually, I'm not sure
 whether these functions should be member functions or not.
The most generic approach to transposition is simply a reordering of indices. This difference is important once you get to 3D arrays and beyond, because then there is no unique transpose, but any permutation of array indices should be permissible. Denis' multidimensional arrays have a method that does O(1) reordering of array indices: basically, you create a "view" of the original array that has its indices swapped around. So there is no data copying; it's just a different "view" into the same underlying data.

This approach of using "views" rather than copying data allows for O(1) submatrix extraction: if you have a 50x50 matrix, then you can take arbitrary 10x10 submatrices of it without needing to copy any of the data, which would be very expensive. Avoiding unnecessary copying becomes very important when the dimension of the array increases; if you have a 3D or 5D array, copying subarrays becomes extremely expensive very quickly. A .dup method should be provided in the cases where you actually *want* to copy the data, of course.

Basically, subarrays / transpositions / index reordering should be regarded as generalizations of D's array slices. No data should be copied until necessary.
 Invertible matrix. It must not be allowed for square StaticMatrix.
You mean for non-square StaticMatrix?
 DynamicMatrix may make checks at runtime. Same as for
 transposition, we can implement 'invert' to invert in place
 and 'inverted' to make a copy. There is an issue here - what
 should we do when the determinant is 0? I believe the best
 approach here is to throw an exception, since if the user
 needs an invertible matrix, it's definitely exceptional when
 it can't be calculated.
[...] Here I'd just like to say that inversion is specific to linear algebra, and doesn't apply to multidimensional arrays in general. So it should be implemented in the Matrix wrapper, not tied to any specific data representation.

As for trying to invert non-invertible matrices: it's just another form of "division by zero". I'd say throwing an exception is the right thing to do here. Of course, you might want to provide an alternative API that checks for invertibility when the caller is not sure whether it can be inverted: since this check pretty much already computes the inverse if it exists, it's probably more efficient to have a single method of doing both. Since it's probably cheaper to do this inversion / check in-place, you could just have a method that takes a Matrix by ref:

    struct Matrix(T)
    {
        ...
        // Returns true if m is invertible, in which case m will
        // be replaced with its inverse; otherwise returns false,
        // and the contents of m will be scrambled. Make a copy
        // of m if you wish to preserve its original value.
        static bool invert(ref Matrix m) { ... }
        ...
    }

This is a sort of low-level operation for performance-sensitive code, so the "nicer" invert operation can be implemented in terms of this one:

    struct Matrix(T)
    {
        ...
        static bool invert(ref Matrix m) { ... }
        ...
        @property Matrix inverse()
        {
            auto result = this.dup;  // make a copy first
            if (!invert(result))
                throw new Exception("Tried to invert non-invertible matrix");
            return result;
        }
    }

    Matrix!(Array!(2,float)) A = ...;
    auto B = A.inverse;  // this returns a separate inverse matrix
    A.invert(A);         // this modifies A in-place to become its inverse

Of course, these method names are tentative; I'm sure you can think of better names for them.

T

-- 
Computers shouldn't beep through the keyhole.
Oct 11 2013
parent reply "FreeSlave" <freeslave93 gmail.com> writes:
On Friday, 11 October 2013 at 17:49:32 UTC, H. S. Teoh wrote:
 On Fri, Oct 11, 2013 at 06:10:19PM +0200, FreeSlave wrote:
 There is a "Matrices and linear algebra" module in the wish
 list. Let's discuss its design. D is a complicated language,
 so it's difficult to choose the right way here. We need to
 find a compromise between efficiency and a convenient
 interface. I'm going to make some suggestions about how this
 module should look.
I think we need to differentiate between multidimensional arrays (as a data storage type) and linear algebra (operations performed on 2D arrays). These two intersect, but they also have areas that are not compatible with each other (e.g. matrix product vs. element-by-element product). Ideally, we should support both in a clean way.

As far as the former is concerned, Denis has implemented a multidimensional array library, and I've independently done the same, with a slightly different interface. I think one or two others have implemented similar libraries as well. It would be good if we standardize the API so that our code can become interoperable.
Can you please give links to both libraries?
 As far as the latter is concerned, I've been meaning to 
 implement a
 double-description convex hull algorithm, but have been too 
 busy to
 actually work on it. This particular algorithm is interesting, 
 because
 it stress-tests (1) performance of D algorithms, and (2) 
 challenges the
 design of matrix APIs because while the input vertices (resp.
 hyperplanes) can be interpreted as a matrix, the algorithm 
 itself also
 needs to permute rows, which means it is most efficient when 
 given an
 array-of-pointers representation, contrary to the usual 
 flattened
 representations (as proposed below). I think there's a place 
 for both,
 which is why we need to distinguish between data representation 
 and the
 algorithms that work on them.


 First of all, it should provide two templates for matrices.
 Let's call them StaticMatrix and DynamicMatrix. The first one
 has a "templated" size and therefore may use static arrays
 and compile-time checks. It can be useful when the size is
 determined by our needs, for example, in graphics.
 DynamicMatrix has a variable size, i.e. it must be allocated
 on the heap. It can be useful in all other math areas.
I like this idea. Ideally, we should have many possible representations, but all conforming to a single API understood by all algorithms, so that you only have to write algorithms once, and they will work with any data structure. That's one key advantage of D, and we should make good use of it.
The problem is that algorithms still need to know the matrix template in order to provide compile-time checks where possible, or to throw exceptions at runtime if something goes wrong.
 I do not want to see a repetition of the C++ situation where 
 there are
 so many different matrix/multidimensional array libraries, and 
 all of
 them use incompatible representations and you cannot freely 
 pass data
 from one to algorithms in the other. Then when no library meets
 precisely what you need, you're forced to reinvent yet another 
 matrix
 class, which is a waste of time.



 Both templates should support all floating-point types and,
 moreover, user-defined ones (for example, wrappers for the
 GMP library and others).
Definitely, yes.
 For efficiency, in both cases matrices should use a
 one-dimensional array for their inner representation. But
 actually I'm not sure whether matrices should support other
 container types besides standard D arrays. The good thing
 about one-dimensional arrays is that they can be easily
 exposed to foreign functions, for example, to C libraries and
 OpenGL. So we should take care of the memory layout - at
 least row-major and column-major. I think it can be templated
 too.
We should not tie algorithms to specific data representations (concrete types). One key advantage of D is that you can write algorithms generically, such that they can work with *any* type as long as it conforms to a standard API.

One excellent example is the range API: *anything* that conforms to the range API can be used with std.algorithm, not just a specific representation. In fact, std.range provides a whole bunch of different ranges and range wrappers, and all of them automatically can be used with std.algorithm, because the code in std.algorithm uses only the range API and never (at least in theory :P) depends on concrete types. We should take advantage of this feature.

It would be good, of course, to provide some standard, commonly-used representations, for example row-major (or column-major) matrix classes / structs, etc.. But the algorithms should not directly depend on these concrete types. An algorithm that works with a matrix stored as a 1D array should also work with a matrix stored as a nested array of arrays, as well as a sparse matrix representation that uses some other kind of storage mechanism. As long as a type conforms to some standard matrix API, it should Just Work(tm) with any std.linalg algorithm.
 But another question arises - which "majority" should we use
 in the interface? The interface should not depend on the
 inner representation. All functions need unambiguity to avoid
 complicating and repeating the design. Well, actually we
 could deal with different majorities in the interface - we
 could provide something like an "asTransposed" adapter that
 functions would apply if needed - but then we would force the
 user to check the majority of the matrix interface, which is
 not a very good approach.
Algorithms shouldn't even care what majority the data representation is in. It should only access data via the standardized matrix API (whatever it is we decide on). The input type should be templated so that *any* type that conforms to this API will work. Of course, for performance-sensitive code, the user should be aware of which representations are best-performing, and make sure to pass in the appropriate type of representations; but we should not prematurely optimize here. Any linear algebra algorithms should be able to work with *any* type that conforms to a standard matrix API.
I'm not sure you understood the idea of the difference between inner-representation majority and interface majority. I agree that the inner majority should be defined by the inner type. Interface majority is just the choice between matrix[rowIndex, columnIndex] and matrix[columnIndex, rowIndex]. In the case of interface majority, we simply must choose one of the two and use it all over the library. It does not relate to performance.
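To make the separation concrete, here is a small sketch in which the storage happens to be column-major but the interface is always [rowIndex, columnIndex]; the type name and members are illustrative only:

```d
// Storage is column-major internally, but the public interface is
// always matrix[rowIndex, columnIndex]; the layout stays hidden.
struct ColumnMajorMatrix
{
    double[6] data;   // 2 rows x 3 columns, column-major
    enum rows = 2, cols = 3;

    ref double opIndex(size_t r, size_t c)
    {
        return data[c * rows + r];   // the layout detail lives here only
    }
}

void main()
{
    ColumnMajorMatrix m;
    m[1, 2] = 5.0;            // row 1, column 2 - always this order
    assert(m.data[5] == 5.0); // flat slot 2*rows + 1 in column-major storage
}
```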
 Sometimes a user takes data from some other source and wants
 to avoid copying during Matrix construction, but she also
 wants to get matrix functionality. So we should provide an
 "arrayAsMatrix" adapter that can adapt one-dimensional and
 two-dimensional arrays, making them feel like matrices. It
 definitely should not make a copy of a dynamic array, but I'm
 not sure about static ones.
If a function expects a 1xN matrix, we should be able to pass in an array and it should Just Work. Manually using adapters should not be needed. Of course, standard concrete matrix types provided by the library should have ctors / factory methods for initializing a matrix object that uses some input array as initial data -- if we design this correctly, it should be a cheap operation (the matrix type itself should just be a thin wrapper over the array to provide methods that conform to the standard matrix API). Then if some function F requires a matrix object, we should be able to just create a Matrix instance with our input array as initial data, and pass it to F.
 About operator overloading. It's quite clear for the 'add'
 and 'subtract' operations, but what about the product? Here I
 think all 'op'-functions should be element-by-element
 operations. So we can use all the other operations too
 without ambiguity. For actual matrix multiplication the
 module can provide a 'multiply' or 'product' function. It's
 similar to Maxima's approach, except that Maxima uses dot
 notation for these needs.
Here is where we see the advantage of separating representation from algorithm. Technically, a matrix is not the same thing as a 2D array, because a matrix has a specific interpretation in linear algebra, whereas a 2D array is just a 2D container of some elements.

My suggestion would be to write a Matrix struct that wraps around a 2D array, and provides / overrides the overloaded operators to have a linear algebra interpretation. So, a 2D array type should have per-element operations, but once wrapped in a Matrix struct, it will acquire special matrix algebra operations like matrix products, inversion, etc.. In the most general case, a 2D array should be a specific instance of a multidimensional array, and a Matrix struct should be able to use any underlying representation that conforms to a 2D array API. For example:

    // Example of a generic multidimensional array type
    struct Array(int dimension, ElemType)
    {
        ...
        Array opBinary(string op)(Array x)
        {
            // implement per-element operations here
        }
    }

    // A matrix wrapper around a 2D array type.
    struct Matrix(T) if (is2DArray!T)
    {
        T representation;

        Matrix opBinary(string op)(Matrix x) if (op == "*")
        {
            // implement matrix multiplication here
        }

        Matrix opBinary(string op)(Matrix x) if (op != "*")
        {
            // forward to representation.opBinary to default
            // to per-element operations
        }

        // Provide operations specific to matrices that don't
        // exist in general multidimensional arrays.
        Matrix invert() { ... }
    }

    Array!(2,float) myArray, myOtherArray;
    auto arrayProd = myArray * myOtherArray; // per-element multiplication

    auto A = Matrix(myArray);       // wrap array in Matrix wrapper
    auto B = Matrix(myOtherArray);
    auto C = A * B;                 // matrix product

The idea of the Matrix struct here is that the user should be free to choose any underlying matrix representation: a 1D array in row-major or column-major representation, or a nested array of arrays, or a sparse array with some other kind of representation.
As long as they provide a standard way of accessing array elements, Matrix should be able to accept them, and provide matrix algebra semantics for them.
 Transposition. I've already mentioned "asTransposed" adapter. It
 should be useful to make matrix feel like transposed without
 copying it. We also can implement 'transpose' and 'transposed'
 functions. The first one transposes matrix in place. It's actually
 not allowed for non-square StaticMatrix since we can't change the
 size of this type of matrices at runtime. The second one returns
 a copy, so it's applicable in all cases. Actually I'm not sure
 whether these functions should be member functions or not.
The most generic approach to transposition is simply a reordering of
indices. This difference is important once you get to 3D arrays and
beyond, because then there is no unique transpose, but any permutation
of array indices should be permissible. Denis' multidimensional arrays
have a method that does O(1) reordering of array indices: basically,
you create a "view" of the original array that has its indices swapped
around. So there is no data copying; it's just a different "view" into
the same underlying data.

This approach of using "views" rather than copying data allows for
O(1) submatrix extraction: if you have a 50x50 matrix, then you can
take arbitrary 10x10 submatrices of it without needing to copy any of
the data, which would be very expensive. Avoiding unnecessary copying
becomes very important when the dimension of the array increases; if
you have a 3D or 5D array, copying subarrays becomes extremely
expensive very quickly. A .dup method should be provided in the cases
where you actually *want* to copy the data, of course.

Basically, subarrays / transpositions / index reordering should be
regarded as generalizations of D's array slices. No data should be
copied until necessary.
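To make the "view" idea concrete, here is a minimal sketch of an O(1) transposed view in D. The names (Transposed, asTransposed, the toy Mat type with rowNum/columnNum) are illustrative only, not a proposed API; the point is that the view merely swaps indices on access and writes through to the original data.

```d
// Illustrative O(1) transposed "view": no data is copied,
// the two indices are simply swapped on every access.
struct Transposed(M)
{
    M* rep; // pointer to the underlying matrix, not a copy

    auto ref opIndex(size_t i, size_t j)
    {
        return (*rep)[j, i]; // swap the indices
    }
}

Transposed!M asTransposed(M)(ref M m)
{
    return Transposed!M(&m);
}

// Toy 2x3 row-major matrix to demonstrate the view on.
struct Mat
{
    double[6] data;
    enum size_t rowNum = 2, columnNum = 3;

    ref double opIndex(size_t i, size_t j)
    {
        return data[i * columnNum + j];
    }
}

unittest
{
    Mat m;
    m[0, 2] = 42;
    auto t = m.asTransposed;
    assert(t[2, 0] == 42); // same element, swapped indices
    t[1, 0] = 7;           // writes through to the original
    assert(m[0, 1] == 7);
}
```

The same pattern generalizes to arbitrary index permutations for higher-dimensional arrays, and to submatrix views that store an offset and strides instead of swapped indices.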
 Invertible matrix. It must not be allowed for square StaticMatrix.
You mean for non-square StaticMatrix?
Yes, non-square. My bad.

Well, ok. We want to abstract from the inner representation to provide
freedom for users. We fall into metaprogramming and generic
programming here, so we need to define concepts just like
Boost/STL/std.range do. The good thing is that in D, types with
different interfaces and syntax constraints can satisfy the same
concept, which would be impossible or very difficult in C++, thanks to
static if and is(typeof()). For example, the inner representation type
can provide the [][] operator or the [,] operator, and the Matrix type
will understand both cases.

Suppose:

template canBeMatrixRepresentation(T)
{
    enum bool canBeMatrixRepresentation = is(typeof(
        {
            T t; //default constructable
            const(T) ct;
            alias ElementType!T E; //has ElementType
            E e; //element type is default constructable
            static if (/*has [,] operator*/)
            {
                t[0,0] = e; //can be assigned
                e = ct[0,0]; //can retrieve element value from const(T)
            }
            else static if (/*has [][] operator*/)
            {
                t[0][0] = e; //can be assigned
                e = ct[0][0]; //can retrieve element value from const(T)
            }
            else
            {
                static assert(false);
            }

            size_t rows = ct.rowNum; //has row number
            size_t cols = ct.columnNum; //has column number
            t.rowNum = size_t.init;
            t.columnNum = size_t.init;
        }));
}

We see that a two-dimensional D array does not satisfy this concept
because it has no rowNum and columnNum, so it should be handled
separately. This concept is not ideal, since not all types may provide
a variable rowNum and columnNum. Also, the concept should expose
whether the type is "static" or not, so algorithms will know whether
they can use compile-time checks.

Types also can provide a copy constructor. If they do, Matrix will use
it; if they don't, Matrix will do an element-by-element copy. It can
also try the .dup property.

It's just an example of how it should work, but I hope the point is
clear. We also need a Matrix concept (or separate concepts for
StaticMatrix and DynamicMatrix).
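Incidentally, the /*has [,] operator*/ and /*has [][] operator*/ placeholders can themselves be expressed as is(typeof()) probes. A minimal sketch (the probe names and the toy Comma2D type are mine, not a proposed API):

```d
// True if T supports two-index opIndex, i.e. t[i, j].
enum bool hasCommaIndexing(T) = is(typeof({
    T t;
    auto e = t[0, 0];
}));

// True if T supports chained indexing, i.e. t[i][j]
// (nested arrays, or single-index opIndex returning a row).
enum bool hasNestedIndexing(T) = is(typeof({
    T t;
    auto e = t[0][0];
}));

// Toy type with a two-index opIndex.
struct Comma2D
{
    double[4] data;
    ref double opIndex(size_t i, size_t j) { return data[i * 2 + j]; }
}

static assert( hasCommaIndexing!Comma2D);
static assert(!hasNestedIndexing!Comma2D);
static assert( hasNestedIndexing!(int[][])); // D jagged array
```

Because both probes are ordinary compile-time booleans, the static if chain in the concept can branch on them directly.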
Oct 11 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 11, 2013 at 09:49:22PM +0200, FreeSlave wrote:
 On Friday, 11 October 2013 at 17:49:32 UTC, H. S. Teoh wrote:
On Fri, Oct 11, 2013 at 06:10:19PM +0200, FreeSlave wrote:
There is "Matrices and linear algebra" module in wish list.  Let's
discuss its design. D is complicated language so it's difficult to
choose the right way here. We need to find compromise between
efficiency and convenient interface. I'm going to make some
suggestions how this module should look like.
I think we need to differentiate between multidimensional arrays (as a
data storage type) and linear algebra (operations performed on 2D
arrays). These two intersect, but they also have areas that are not
compatible with each other (e.g. matrix product vs. element-by-element
product). Ideally, we should support both in a clean way.

As far as the former is concerned, Denis has implemented a
multidimensional array library, and I've independently done the same,
with a slightly different interface. I think one or two others have
implemented similar libraries as well. It would be good if we
standardize the API so that our code can become interoperable.
Can you please give links to both libraries?
I seem to have lost the link to Denis' code, but you can see the docs
here:

http://denis-sh.bitbucket.org/unstandard/unstd.multidimensionalarray.html

I haven't put my code up in any public repo (yet), because it's not
quite ready for public consumption. But you can see the docs (somewhat
older version) here:

http://eusebeia.dyndns.org/~hsteoh/tmp/doc/mda

I'm not 100% satisfied with its API design (spanningIndices probably
introduces an unacceptable bottleneck in many cases), though it does
employ the principle of generalizing D array slices, so all instances
of Arrays are "views" into some underlying data. Assignment,
subarrays, etc., are therefore cheap and have reference semantics.
Subarrays can be employed to allow block-initialization of arrays,
etc.. A .dup method is provided to make actual copies of data when
needed. There are some ugly hacks (IndexRange) in order to work around
language limitations until Kenji's pull for implementing
multidimensional slicing is committed.

My code is also not as flexible as Denis' when it comes to index
reordering, but it does work in the 2D case. Some of the unittests
successfully demonstrate that the code is generic enough to handle
arrays of non-numeric types, such as arrays of strings, in which case
per-element unary/binary operations work as expected (e.g., A ~ "a"
appends "a" to all elements, when A is an array of strings). [...]
First of all, it should provide two templates for matrices.  Let's
call them StaticMatrix and DynamicMatrix. The first one has
"templated" size and therefore may use static arrays and
compile-time checks. It can be useful when the size is determined by
our needs, for example, in graphics. DynamicMatrix has variable
size, i.e. it should be created in heap. It can be useful in all
other math areas.
I like this idea. Ideally, we should have many possible representations, but all conforming to a single API understood by all algorithms, so that you only have to write algorithms once, and they will work with any data structure. That's one key advantage of D, and we should make good use of it.
The problem is that algorithms still should know the matrix template,
to provide compile-time checks if possible or throw exceptions at
runtime if something goes wrong.
We should simply have standard APIs for checking for compile-time available properties, similar to std.range.hasLength. Then algorithms that care about the distinction can make use of it, and other algorithms that don't care will automatically work with both variants. [...]
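A trait in the spirit of std.range.hasLength might look like the following sketch. The name hasStaticRowCount and the rowNum member are illustrative assumptions, not an agreed API:

```d
// Illustrative trait: true if M's row count is readable at
// compile time (i.e. usable in a static context).
enum bool hasStaticRowCount(M) = is(typeof({
    enum size_t n = M.rowNum; // must be a compile-time constant
}));

// A static matrix exposes its dimensions as manifest constants...
struct StaticMatrix(size_t R, size_t C)
{
    enum size_t rowNum = R;
    enum size_t columnNum = C;
}

// ...while a dynamic matrix only knows them at runtime.
struct DynamicMatrix
{
    size_t rowNum;
    size_t columnNum;
}

static assert( hasStaticRowCount!(StaticMatrix!(3, 3)));
static assert(!hasStaticRowCount!DynamicMatrix);
```

An algorithm can then use static if (hasStaticRowCount!M) to do bounds checking at compile time and fall back to runtime checks otherwise, exactly like range algorithms branch on hasLength.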
But another question arises - which "majority" should we use in
interface? Interface should not depend on inner representation.  All
functions need unambiguity to avoid complication and repetition of
design. Well, actually we can deal with different majority in
interface - we can provide something like "asTransposed" adapter,
that will be applied by functions if needed, but then we will force
user to check majority of matrix interface, it's not very good
approach.
Algorithms shouldn't even care what majority the data representation is in. It should only access data via the standardized matrix API (whatever it is we decide on). The input type should be templated so that *any* type that conforms to this API will work. Of course, for performance-sensitive code, the user should be aware of which representations are best-performing, and make sure to pass in the appropriate type of representations; but we should not prematurely optimize here. Any linear algebra algorithms should be able to work with *any* type that conforms to a standard matrix API.
I'm not sure you understood the difference between
inner-representation majority and interface majority. I agree that
inner majority should be defined by the inner type. Interface majority
is just the choice between

matrix[rowIndex, columnIndex]

and

matrix[columnIndex, rowIndex]

In the case of interface majority we just must choose the appropriate
one and use it all over the library. It does not relate to
performance.
Sorry, I misunderstood what you meant. I thought you were talking
about storage order, but apparently you were talking about index order
in the API, which is a different issue.

My personal preference is columnIndex first, then rowIndex, but that
is opposite of standard matrix notation in math (which is row-first
then column). I think we should just choose one way or the other, and
just stick with it consistently. The important thing is to be
consistent to prevent bugs; exactly which order is the "right" way is
just bikeshedding, I think.

There's another related point here, related to declaring array
dimensions. In my multi-dim array code, I chose to have the index
order match the order of the specified array dimensions; e.g., if you
declare:

auto A = Array!(2,int)([2, 5]);

then you'd index A from A[0,0] to A[1,4]. However, this is opposite of
what nested arrays in D do:

int[2][5] B;

Here, the index range for B goes from B[0][0] to B[4][1], because B is
a 5-element array of 2-element subarrays. Again, I'm not sure what the
"correct" way is; I chose my way because it's easier to remember (the
order you specify dimensions is the same order the indices appear in,
so you don't have to keep remembering when to reverse the order). But
either way, a single standard approach should be chosen, and we should
stick with it consistently. [...]
 Well, ok. We want to abstract from inner representation to provide
 freedom for users. We fall in metaprogramming and generic programming
 here, so we need to define concepts just like Boost/STL/std.range do.
 The good thing is that in D types with different interfaces and syntax
 constraints can satisfy same concept that would be impossible or very
 difficult in C++. Thanks to static if and is(typeof()). For example
 inner representation type can provide [][] operator or [,] operator
 and Matrix type will understand both cases.
 
 Suppose:
 
 template canBeMatrixRepresentation(T)
<bikeshed> What about a nicer, more concise name, like is2DArray? :) </bikeshed>
 {
     enum bool canBeMatrixRepresentation = is(typeof(
         {
             T t; //default constructable
             const(T) ct;
             alias ElementType!T E; //has ElementType
             E e; //element type is default constructable
             static if (/*has [,] operator*/)
             {
                 t[0,0] = e; //can be assigned
                 e = ct[0,0]; //can retrieve element value from const(T)
             }
             else static if (/*has [][] operator*/)
             {
                 t[0][0] = e; //can be assigned
                 e = ct[0][0]; //can retrieve element value from const(T)
             }
             else
             {
                 static assert(false);
             }
 
             size_t rows = ct.rowNum; //has row number
             size_t cols = ct.columnNum; //has column number
I'm uneasy about this one. This is too specific to 2D arrays, and
would not work with general multidimensional arrays (that just happen
to be 2D). I think it's better to use opDollar:

size_t rows = ct.opDollar!0; // assume row-major indexing
size_t cols = ct.opDollar!1;

Requiring user types to implement opDollar has the advantage that then
we can provide nice $ notation inside the slicing operator:

auto topLeftElem = myMatrix[0,0];
auto bottomRightElem = myMatrix[$-1, $-1];

[...]
 We see that a two-dimensional D array does not satisfy this concept
 because it has no rowNum and columnNum, so it should be handled
 separately. This concept is not ideal, since not all types may provide
 a variable rowNum and columnNum. Also, the concept should expose
 whether the type is "static" or not, so algorithms will know whether
 they can use compile-time checks.
I think we should use opDollar instead of arbitrarily-named custom
fields like rowNum or columnNum. Now, opDollar doesn't quite support
built-in nested arrays, so I'm not sure how to fix that; perhaps
provide some wrappers in std.array? It's kinda annoying, though,
because nested arrays may be jagged, in which case it cannot be
treated as a matrix. But rejecting all nested arrays outright seems a
bit too limited. Maybe provide a wrapper to convert nested arrays into
"real" 2D arrays (with runtime checking against jaggedness)?

That also lets us check if a particular dimension is compile-time
known or not:

template isStaticDim(Array, int dim)
{
    enum isStaticDim = is(typeof({
        // If it's possible to put opDollar!dim into an
        // enum, then it's statically-known, otherwise
        // it's dynamic.
        enum length = Array.init.opDollar!dim;
    }));
}

StaticMatrix!(4,4) sm;    // both dimensions known at compile-time
DynamicMatrix dm(4,4);    // both dimensions specified at runtime
RowAppendableMatrix!4 am; // one dimension specified at runtime

assert(isStaticDim!(typeof(sm), 0));
assert(isStaticDim!(typeof(sm), 1));
assert(!isStaticDim!(typeof(dm), 0));
assert(!isStaticDim!(typeof(dm), 1));
assert(isStaticDim!(typeof(am), 0));
assert(!isStaticDim!(typeof(am), 1));
 Types also can provide a copy constructor. If they do, then Matrix
 will use it; if they don't, then Matrix will do an element-by-element
 copy. It also can try the .dup property.
I think it should be OK to standardize on ".dup", by analogy with built-in arrays. If no such method is provided, then do element-by-element copy.
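A sketch of that fallback logic, for the record. The helper name duplicate is mine, and it assumes the rowNum/columnNum/two-index opIndex interface discussed upthread, so treat it as an illustration rather than a proposed implementation:

```d
// Illustrative copy helper: prefer .dup when the type provides
// it, otherwise fall back to an element-by-element copy.
M duplicate(M)(ref M m)
{
    static if (is(typeof(m.dup) : M))
    {
        return m.dup; // the type supplies its own copying
    }
    else
    {
        M copy; // assumes M is default-constructible
        // Propagate dimensions when they are runtime-settable;
        // for static matrices this assignment doesn't compile
        // and is skipped.
        static if (is(typeof(copy.rowNum = m.rowNum)))
        {
            copy.rowNum = m.rowNum;
            copy.columnNum = m.columnNum;
        }
        foreach (i; 0 .. m.rowNum)
            foreach (j; 0 .. m.columnNum)
                copy[i, j] = m[i, j];
        return copy;
    }
}

// Toy static 2x2 matrix with no .dup, to exercise the fallback.
struct SM
{
    double[4] data;
    enum size_t rowNum = 2, columnNum = 2;
    ref double opIndex(size_t i, size_t j) { return data[i * 2 + j]; }
}

unittest
{
    SM a;
    a[1, 1] = 5;
    auto b = duplicate(a);
    b[1, 1] = 9;
    assert(a[1, 1] == 5); // the copy is independent
    assert(b[1, 1] == 9);
}
```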
 It's just an example of how it should work, but I hope the point is
 clear. We also need a Matrix concept (or separate concepts for
 StaticMatrix and DynamicMatrix).
Looks like a hierarchy of concepts:

                 Matrix
                /      \
    StaticMatrix        DynamicMatrix

The Matrix concept provides the common core APIs for accessing
matrices; the more specific concepts StaticMatrix and DynamicMatrix
build on top of Matrix by adding more API methods/properties specific
to static or dynamic matrices.

If an algorithm doesn't care which kind of matrix is passed to it, it
can simply take Matrix (via an isMatrix signature constraint, a la
std.algorithm.* specifying isInputRange, etc.). If it wants a specific
kind of matrix, it specifies that instead (e.g. isStaticMatrix). If it
can handle both, it can use static if to treat each case separately:

auto myAlgo(M)(M matrix)
    if (isMatrix!M) // accept most general form of matrix
{
    ... // code
    static if (isStaticMatrix!M)
        ... // stuff specific to static matrices
    else static if (isDynamicMatrix!M)
        ... // stuff specific to dynamic matrices
    else
        static assert(0); // oops
    ... // more code
}


T

-- 
VI = Visual Irritation
Oct 11 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
First thing. I see std.linarg I have no clue whatsoever what it 
even may be about. Do we really want to have a standard lib with 
names that look like base64 encoded ?
Oct 11 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Oct 12, 2013 at 12:36:08AM +0200, deadalnix wrote:
 First thing. I see std.linarg I have no clue whatsoever what it even
 may be about. Do we really want to have a standard lib with names
 that look like base64 encoded ?
I picked it up immediately. It's an obvious abbreviation of "linear algebra". I'm not sure I understand this resentment against abbreviated names. By your argument, we should rename std.xml to std.extensibleMarkupLanguage instead. Bring on the rainbow, the bikeshed is up for painting. T -- You are only young once, but you can stay immature indefinitely. -- azephrahel
Oct 11 2013
next sibling parent Justin Whear <justin economicmodeling.com> writes:
On Fri, 11 Oct 2013 15:42:17 -0700, H. S. Teoh wrote:

 On Sat, Oct 12, 2013 at 12:36:08AM +0200, deadalnix wrote:
 First thing. I see std.linarg I have no clue whatsoever what it even
 may be about. Do we really want to have a standard lib with names that
 look like base64 encoded ?
I picked it up immediately. It's an obvious abbreviation of "linear algebra". I'm not sure I understand this resentment against abbreviated names. By your argument, we should rename std.xml to std.extensibleMarkupLanguage instead. Bring on the rainbow, the bikeshed is up for painting. T
Don't be silly. It'd obviously have to be "standardLibrary.extensibleMarkupLanguage"
Oct 11 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/11/13 3:42 PM, H. S. Teoh wrote:
 On Sat, Oct 12, 2013 at 12:36:08AM +0200, deadalnix wrote:
 First thing. I see std.linarg I have no clue whatsoever what it even
 may be about. Do we really want to have a standard lib with names
 that look like base64 encoded ?
I picked it up immediately. It's an obvious abbreviation of "linear algebra".
Honestly I first thought it was "linear-time algorithms". Andrei
Oct 11 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Oct 11, 2013 at 04:13:41PM -0700, Andrei Alexandrescu wrote:
 On 10/11/13 3:42 PM, H. S. Teoh wrote:
On Sat, Oct 12, 2013 at 12:36:08AM +0200, deadalnix wrote:
First thing. I see std.linarg I have no clue whatsoever what it even
may be about. Do we really want to have a standard lib with names
that look like base64 encoded ?
I picked it up immediately. It's an obvious abbreviation of "linear algebra".
Honestly I first thought it was "linear-time algorithms".
[...] But I thought that was the province of std.algorithm? BTW, are we going to split up std.algorithm anytime soon, now that we have package.d? Jonathan is already working on splitting up std.datetime into more manageable pieces, and it would be nice if std.algorithm can be broken up into saner pieces too. (The last time I brought it up a few people supported it, but I haven't had the time to do anything about it so far.) T -- The easy way is the wrong way, and the hard way is the stupid way. Pick one.
Oct 11 2013
prev sibling parent reply "Andrej Mitrovic" <andrej.mitrovich gmail.com> writes:
On Friday, 11 October 2013 at 23:13:34 UTC, Andrei Alexandrescu 
wrote:
 Honestly I first thought it was "linear-time algorithms".

 Andrei
I thought it was some kind of algae species. Why not just call it std.algebra?
Oct 11 2013
parent "Brian Rogoff" <brogoff gmail.com> writes:
On Friday, 11 October 2013 at 23:24:00 UTC, Andrej Mitrovic wrote:
 On Friday, 11 October 2013 at 23:13:34 UTC, Andrei Alexandrescu 
 wrote:
 Honestly I first thought it was "linear-time algorithms".

 Andrei
I thought it was some kind of algae species. Why not just call it std.algebra?
Because, amongst people who know, numerical linear algebra algorithms
form their own little world of computing apart from what may be called
'algebra'. 'Algebra' could mean any number of things, like symbolic
algebra, but the entire world of MATLAB/R programmers knows exactly
what linalg might be: numerical linear algebra. See the reference text
by Golub and Van Loan.

I would try to fit this module under some larger numeric or scientific
computing hierarchy, but I don't see one in D yet. The Julia language
is the competitor here, and it is already well ahead of D, but who
knows how the race will end?

-- Brian
Oct 11 2013
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 11 October 2013 at 22:36:09 UTC, deadalnix wrote:
 First thing. I see std.linarg I have no clue whatsoever what it 
 even may be about. Do we really want to have a standard lib 
 with names that look like base64 encoded ?
linalg is a pretty standard contraction of linear algebra. It's in common usage.
Oct 12 2013
prev sibling next sibling parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Friday, 11 October 2013 at 16:10:21 UTC, FreeSlave wrote:
 There is "Matrices and linear algebra" module in wish list. 
 Let's discuss its design. D is complicated language so it's 
 difficult to choose the right way here. We need to find 
 compromise between efficiency and convenient interface. I'm 
 going to make some suggestions how this module should look like.
 Please, make your suggestions too.
Firstly, my opinion is, it shouldn't be in the std package. Secondly,
if it's not super highly performant, no one will use it.

The highest-performing linalg libraries implement the low-level BLAS
API or are vendor-specific, like the Intel MKL. These are
super-optimized libraries where the asm code is hand-optimized by
specialists (GotoBLAS and the derived OpenBLAS, for instance), and
these will be *very* hard to beat. So much so that if BLAS isn't used
as the engine, it's very unlikely that your library will be used for
serious number crunching.

Built above BLAS, you can find a number of libraries that make using
linear algebra more pleasant. One such library, currently the best
IMHO, is Armadillo C++. It has the very big advantage that it mimics
Matlab functions. This feature is extremely useful, because scientists
very often model their algorithms in Matlab before porting them to
C++. This is what is done in the company I work for.

So the best I could suggest is to study the Armadillo C++ library and
port it to D.
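As a rough illustration of what "BLAS as the engine" means in D terms: a hand-written extern(C) binding is enough to delegate the hot path. This sketch assumes linking against some CBLAS implementation such as OpenBLAS; the integer constants 101/111 stand for CblasRowMajor/CblasNoTrans from the C header, and matmul is an illustrative wrapper name:

```d
// Minimal hand-written binding to the CBLAS double-precision
// GEMM routine. Requires linking against a CBLAS implementation
// (e.g. OpenBLAS).
extern (C) void cblas_dgemm(
    int order,              // 101 = row-major, 102 = column-major
    int transA, int transB, // 111 = no transpose, 112 = transpose
    int m, int n, int k,
    double alpha,
    const(double)* a, int lda,
    const(double)* b, int ldb,
    double beta,
    double* c, int ldc);

// C = A * B for row-major 1D storage: A is m x k, B is k x n.
void matmul(const(double)[] a, const(double)[] b, double[] c,
            int m, int n, int k)
{
    cblas_dgemm(101, 111, 111, m, n, k,
                1.0, a.ptr, k, b.ptr, n,
                0.0, c.ptr, n);
}
```

A high-level Matrix type could dispatch to such a routine for float/double elements while keeping a generic pure-D fallback for user-defined element types like GMP wrappers.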
Oct 11 2013
parent reply "FreeSlave" <freeslave93 gmail.com> writes:
On Saturday, 12 October 2013 at 05:20:11 UTC, SomeDude wrote:
 On Friday, 11 October 2013 at 16:10:21 UTC, FreeSlave wrote:
 There is "Matrices and linear algebra" module in wish list. 
 Let's discuss its design. D is complicated language so it's 
 difficult to choose the right way here. We need to find 
 compromise between efficiency and convenient interface. I'm 
 going to make some suggestions how this module should look 
 like.
 Please, make your suggestions too.
Firstly, my opinion is, it shouldn't be in the std package. Secondly, if it's not super highly performant, noone will use it. The highest performing linalg libraries implement the low level BLAS API or are vendor specific, like the Intel MKL. These are super optimized libraries where the asm code is hand optimized by specialists (GotoBLAS and the derived OpenBLAS for instance), and these will be *very* hard to beat. So much so that if BLAS isn't used as the engine, it's very unlikely that your library will be used for serious number crunching. Built above BLAS, you can find a number of libraries that make using linear algebra more pleasant. One such library, currently the best IMHO is Armadillo C++. It has the very big advantage that it mimics Matlab functions. This feature is extremely useful, because scientists very often model their algorithms in Matlab before porting them to C++. This is what is done in the company I work for. So the best I could suggest is to study the Armadillo C++ library and port it to D.
For these cases we may let users choose a low-level backend if they
need one. A high-level interface and default implementation are needed
anyway.

I called it std.linalg because there is a website,
http://www.linalg.org/, about a C++ library for exact computational
linear algebra. Also, SciD has the module scid.linalg. We can use
std.linearalgebra or something else. Names are not really important
now.

Ok, things are clearer now. Today I'll look at what I can do.
Oct 11 2013
parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Saturday, 12 October 2013 at 06:24:58 UTC, FreeSlave wrote:
 For these cases we may let users to choose low-level backend if 
 they need. High-level interface and default implementation are 
 needed anyway.

 I called it std.linalg because there is website 
 http://www.linalg.org/ about C++ library for exact 
 computational linear algebra. Also SciD has module scid.linalg. 
 We can use std.linearalgebra or something else. Names are not 
 really important now.

 Ok, things are more clear now. Today I look what I can do.
There are literally dozens of linear algebra packages: Eigen,
Armadillo, Blitz++, IT++, etc. I was not complaining about the linalg
name, but about the fact that you want to make it a std subpackage. I
contend that if you want to make it a std package, it must be nearly
perfect, i.e. better performing than ALL the other alternatives, even
the C++ ones, and really good as an API. Otherwise it will be
deprecated because someone will have made a better alternative. Given
the number of past tries, I consider this project very likely doomed
to failure. So no std, please.
Oct 12 2013
next sibling parent "FreeSlave" <freeslave93 gmail.com> writes:
On Saturday, 12 October 2013 at 08:47:33 UTC, SomeDude wrote:
 On Saturday, 12 October 2013 at 06:24:58 UTC, FreeSlave wrote:
 For these cases we may let users to choose low-level backend 
 if they need. High-level interface and default implementation 
 are needed anyway.

 I called it std.linalg because there is website 
 http://www.linalg.org/ about C++ library for exact 
 computational linear algebra. Also SciD has module 
 scid.linalg. We can use std.linearalgebra or something else. 
 Names are not really important now.

 Ok, things are more clear now. Today I look what I can do.
There are literally dozens of linear algebra packages: Eigen,
Armadillo, Blitz++, IT++, etc. I was not complaining about the linalg
name, but about the fact that you want to make it a std subpackage. I
contend that if you want to make it a std package, it must be nearly
perfect, i.e. better performing than ALL the other alternatives, even
the C++ ones, and really good as an API. Otherwise it will be
deprecated because someone will have made a better alternative. Given
the number of past tries, I consider this project very likely doomed
to failure. So no std, please.
It's not my idea to include this kind of module into std. I found it
in the wish list here, http://wiki.dlang.org/Wish_list, so I supposed
the community needs it and that it should be discussed and designed,
and then implemented by someone, not necessarily by me, though I can
and will try to do it myself. You're right, we should first study
existing solutions and then make the best one in D.
Oct 12 2013
prev sibling parent "CJS" <Prometheus85 hotmail.com> writes:
I'm with SomeDude: This project is doomed to failure from the get 
go, and under no circumstances should be put into std.

First, some background so this doesn't seem too harsh. Numerical 
linear algebra is easily the single oldest and most 
well-developed area of computing. Period. There's a huge history 
there, and if you don't know it then whatever you make won't be 
used and will be universally derided. Second, beyond the 
background of implementations is the theoretical background of 
what they're used for and how that influences the API.

You also have to understand that linear algebra algorithms and data 
structures come in 3 major varieties: Dense, Sparse Direct, and 
Iterative methods. For just the theoretical background on these 
and associated floating point issues there is an absolute minimum 
of the material similar to what's covered in these two standard 
references:
Matrix Computations by Golub and Van Loan: 
http://www.amazon.com/Computations-Hopkins-Mathematical-Sciences-ebook/dp/B00BD2DVIC/ref=sr_1_6?ie=UTF8&qid=1381614419&sr=8-6&keywords=nick+higham

Accuracy and Stability of Numerical Algorithms by Nick Higham: 
http://www.amazon.com/Accuracy-Stability-Numerical-Algorithms-Nicholas/dp/0898715210/ref=sr_1_7?ie=UTF8&qid=1381614419&sr=8-7&keywords=nick+higham

The old, well-tested, and venerable BLAS and LAPACK APIs/packages 
are just for dense matrices. There are other variants for 
different kinds of parallelism. And notice there are differences 
in numerical linear algebra between instruction level parallelism 
(like SSE and AVX), shared memory, distributed memory, and 
coprocessors (like OpenCL or CUDA). In terms of popular existing 
implementations there are more recent packages like Magma 
(http://icl.utk.edu/magma/), 
PLASMA (http://icl.cs.utk.edu/plasma/), PETSc 
(http://www.mcs.anl.gov/petsc/), Trilinos 
(http://trilinos.sandia.gov/), or Elemental 
(http://libelemental.org/). The most popular sparse direct 
package, as far as I know, is SuiteSparse 
(http://www.cise.ufl.edu/research/sparse/SuiteSparse/), which is 
wrapped by Eigen 
(http://eigen.tuxfamily.org/index.php?title=Main_Page). Eigen is 
so popular in part because it does a good job of balancing 
expression template magic for dense matrices and wrapping/calling 
BLAS/LAPACK and SuiteSparse.

The very popular NumPy and SciPy packages are also basically 
wrappers around BLAS and LAPACK. There is also a scipy.sparse 
package, but its API differences are an ongoing headache.

All of which is to say this is a very deep area, the differences 
in algorithms and storage are sufficiently fundamental to make a 
unified API very difficult, and in any case the D community 
simply cannot commit to developing and maintaining such a library 
or commit to rolling in future developments from ongoing research 
in the area. This is exactly the sort of thing that should be a 
third-party extension, like in every other language, rather than 
in the standard library.


On Saturday, 12 October 2013 at 08:47:33 UTC, SomeDude wrote:
 On Saturday, 12 October 2013 at 06:24:58 UTC, FreeSlave wrote:
 For these cases we may let users to choose low-level backend 
 if they need. High-level interface and default implementation 
 are needed anyway.

 I called it std.linalg because there is website 
 http://www.linalg.org/ about C++ library for exact 
 computational linear algebra. Also SciD has module 
 scid.linalg. We can use std.linearalgebra or something else. 
 Names are not really important now.

 Ok, things are more clear now. Today I look what I can do.
There are literally dozens of linear algebra packages: Eigen,
Armadillo, Blitz++, IT++, etc. I was not complaining about the linalg
name, but about the fact that you want to make it a std subpackage. I
contend that if you want to make it a std package, it must be nearly
perfect, i.e. better performing than ALL the other alternatives, even
the C++ ones, and really good as an API. Otherwise it will be
deprecated because someone will have made a better alternative. Given
the number of past tries, I consider this project very likely doomed
to failure. So no std, please.
Oct 12 2013
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 11 October 2013 at 16:10:21 UTC, FreeSlave wrote:
 There is "Matrices and linear algebra" module in wish list. 
 Let's discuss its design. D is complicated language so it's 
 difficult to choose the right way here. We need to find 
 compromise between efficiency and convenient interface. I'm 
 going to make some suggestions how this module should look like.

 First of all, it should provide two templates for matrices. 
 Let's call them StaticMatrix and DynamicMatrix. The first one 
 has "templated" size and therefore may use static arrays and 
 compile-time checks. It can be useful when the size is 
 determined by our needs, for example, in graphics. 
 DynamicMatrix has variable size, i.e. it should be created in 
 heap. It can be useful in all other math areas.

 Both templates should support all floating-point types and, 
 moreover, user-defined ones (for example, wrappers for the GMP 
 library and others).

 For efficiency, in both cases matrices should use a 
 one-dimensional array for their inner representation. But 
 actually I'm not sure whether matrices should support other 
 container types besides standard D arrays. The good thing about 
 one-dimensional arrays is that they can be easily exposed to 
 foreign functions, for example, to C libraries and OpenGL. So 
 we should take care of memory layout - at least row-major and 
 column-major. I think it can be templated too.
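The layout could indeed be a template parameter; the only thing that changes is the index computation into the flat array. A small sketch of that idea (names are hypothetical):

```d
enum StorageOrder { rowMajor, columnMajor }

// Map 2-D coordinates to a position in the flat 1-D storage,
// with the layout selected at compile time.
size_t flatIndex(StorageOrder order)(size_t r, size_t c,
                                     size_t rows, size_t cols)
{
    static if (order == StorageOrder.rowMajor)
        return r * cols + c;   // rows are contiguous (C/OpenGL-style arrays)
    else
        return c * rows + r;   // columns are contiguous (Fortran/BLAS-style)
}

unittest
{
    // Element (0, 1) of a 2x3 matrix lands in different slots:
    assert(flatIndex!(StorageOrder.rowMajor)(0, 1, 2, 3) == 1);
    assert(flatIndex!(StorageOrder.columnMajor)(0, 1, 2, 3) == 2);
}
```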

 But another question arises - which "majority" (row-major or 
 column-major order) should we use in the interface? The 
 interface should not depend on the inner representation. All 
 functions need unambiguity to avoid complicating and repeating 
 the design. Well, actually we can deal with different majority 
 in the interface - we can provide something like an 
 "asTransposed" adapter that will be applied by functions if 
 needed, but then we force the user to check the majority of the 
 matrix interface; it's not a very good approach.
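For concreteness, such an "asTransposed" adapter could be a thin view that swaps the index order without touching the data. Everything in this sketch besides the name asTransposed is an assumption made for illustration:

```d
// A minimal row-major matrix to wrap (hypothetical, for illustration).
struct SimpleMatrix
{
    double[] data;
    size_t rows, cols;

    ref double opIndex(size_t r, size_t c) { return data[r * cols + c]; }
}

// The view: element (r, c) of the view is element (c, r) of the source.
struct TransposedView
{
    SimpleMatrix* source;

    ref double opIndex(size_t r, size_t c) { return (*source)[c, r]; }
}

TransposedView asTransposed(ref SimpleMatrix m)
{
    return TransposedView(&m);
}

unittest
{
    auto m = SimpleMatrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3);
    auto t = m.asTransposed;
    assert(t[2, 1] == m[1, 2]);   // same storage, no copy made
}
```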

 Sometimes a user takes data from some other source and wants to 
 avoid copying during Matrix construction, but she also wants to 
 get matrix functionality. So we should provide an 
 "arrayAsMatrix" adapter that can wrap one-dimensional and 
 two-dimensional arrays, making them feel like matrices. It 
 definitely should not make a copy of a dynamic array, but I'm 
 not sure about static ones.
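For the dynamic-array case, the adapter can simply borrow the slice, so writes through the matrix are visible in the original array. A sketch under that assumption (the name arrayAsMatrix is from the proposal, the rest is hypothetical):

```d
// Wrap an existing 1-D slice as a matrix view; no copy is made.
struct ArrayMatrix(T)
{
    T[] data;          // borrowed slice
    size_t rows, cols;

    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];   // row-major view assumed
    }
}

auto arrayAsMatrix(T)(T[] arr, size_t rows, size_t cols)
{
    assert(arr.length == rows * cols);
    return ArrayMatrix!T(arr, rows, cols);
}

unittest
{
    auto raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    auto m = arrayAsMatrix(raw, 2, 3);
    m[0, 2] = 30.0;
    assert(raw[2] == 30.0);   // the matrix aliases the original array
}
```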

 About operator overloading. It's quite clear for 'add' and 
 'subtract' operations, but what about the product? Here I think 
 all 'op'-functions should be 'element by element' operations, 
 so we can use all other operations too without ambiguity. For 
 actual matrix multiplication the module can provide a 
 'multiply' or 'product' function. It's similar to Maxima's 
 approach, except that Maxima uses dot notation for these needs.
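The element-by-element convention falls out naturally from D's string-parameterized opBinary: one mixin handles +, -, *, / uniformly, while the true matrix product gets its own named function. A 2x2 sketch of that split (the type and names are hypothetical):

```d
struct Mat2
{
    double[4] a;   // row-major 2x2 storage

    // Every operator is applied element by element.
    Mat2 opBinary(string op)(Mat2 rhs)
    {
        Mat2 r;
        foreach (i; 0 .. 4)
            mixin("r.a[i] = a[i] " ~ op ~ " rhs.a[i];");
        return r;
    }
}

// The actual matrix product lives in a separate named function.
Mat2 multiply(Mat2 x, Mat2 y)
{
    Mat2 r;
    foreach (i; 0 .. 2)
        foreach (j; 0 .. 2)
        {
            r.a[i * 2 + j] = 0;
            foreach (k; 0 .. 2)
                r.a[i * 2 + j] += x.a[i * 2 + k] * y.a[k * 2 + j];
        }
    return r;
}

unittest
{
    auto x = Mat2([1.0, 2.0, 3.0, 4.0]);
    auto y = Mat2([5.0, 6.0, 7.0, 8.0]);
    assert((x * y).a == [5.0, 12.0, 21.0, 32.0]);        // element-wise '*'
    assert(multiply(x, y).a == [19.0, 22.0, 43.0, 50.0]); // matrix product
}
```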

 Transposition. I've already mentioned the "asTransposed" 
 adapter. It should be useful for making a matrix feel 
 transposed without copying it. We can also implement 
 'transpose' and 'transposed' functions. The first one 
 transposes a matrix in place. It's actually not possible for a 
 non-square StaticMatrix, since we can't change the size of this 
 type of matrix at runtime. The second one returns a copy, so 
 it's applicable in all cases. Actually, I'm not sure whether 
 these functions should be member functions or not.
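The square-only restriction on in-place transposition can be seen in a small sketch over flat row-major storage (free functions here, but they could as well be members):

```d
// 'transposed' returns a copy, so it works for any shape;
// the result of transposing a (rows x cols) matrix is (cols x rows).
double[] transposed(const double[] data, size_t rows, size_t cols)
{
    auto result = new double[](rows * cols);
    foreach (r; 0 .. rows)
        foreach (c; 0 .. cols)
            result[c * rows + r] = data[r * cols + c];
    return result;
}

// In-place 'transpose' only makes sense when the shape is unchanged,
// i.e. for square matrices: swap each pair across the diagonal.
void transpose(double[] data, size_t n)
{
    foreach (r; 0 .. n)
        foreach (c; r + 1 .. n)
        {
            auto tmp = data[r * n + c];
            data[r * n + c] = data[c * n + r];
            data[c * n + r] = tmp;
        }
}

unittest
{
    auto m = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];     // 2x3
    assert(transposed(m, 2, 3) == [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]);

    auto s = [1.0, 2.0, 3.0, 4.0];               // 2x2, in place
    transpose(s, 2);
    assert(s == [1.0, 3.0, 2.0, 4.0]);
}
```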

 Matrix inversion. It must not be allowed for a non-square 
 StaticMatrix. DynamicMatrix may make the check at runtime. As 
 with transposition, we can implement 'invert' to invert in 
 place and 'inverted' to make a copy. There is an issue here - 
 what should we do when the determinant is 0? I believe the best 
 approach is to throw an exception, since if the user needs the 
 inverse it's definitely an exceptional situation when it can't 
 be calculated.
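A sketch of the exception-throwing behavior for the simplest case, a 2x2 matrix over flat row-major storage (the function name and representation are assumptions; a real implementation would use LU decomposition, not the closed form):

```d
import std.exception : enforce;
import std.math : isClose;

// Invert a 2x2 matrix given as row-major [a, b, c, d];
// throws if the determinant is zero.
double[] inverted2x2(const double[] m)
{
    auto det = m[0] * m[3] - m[1] * m[2];
    enforce(det != 0, "matrix is singular: determinant is 0");
    return [ m[3] / det, -m[1] / det,
            -m[2] / det,  m[0] / det];
}

unittest
{
    auto inv = inverted2x2([4.0, 7.0, 2.0, 6.0]);    // det = 10
    assert(isClose(inv[0], 0.6) && isClose(inv[1], -0.7));

    import std.exception : assertThrown;
    assertThrown(inverted2x2([1.0, 2.0, 2.0, 4.0])); // det = 0 -> throws
}
```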

 Please, make your suggestions too.
I'd like to echo some of the points made by CJS. Rolling your own linear algebra (for anything other than very small matrices) is an exercise in futility unless you are prepared to become a world-class expert in what is an incredibly mature and detailed field. Clever wrapping of industry-standard libraries is the way to go. I would love to see this https://github.com/cristicbz/scid taken further; it was a fork of SciD that used clever magic to reduce matrix expressions to a semi-optimal set of BLAS calls. It doesn't appear to have got much love in recent times.
Oct 12 2013