mathnet-numerics
For consistency, Linear Algebra providers should handle the Math.NET Matrix&lt;T&gt; and Vector&lt;T&gt;
Most high-performance code that uses BLAS/LAPACK is at a disadvantage on Math.NET when ported from other architectures or languages.
Dealing with the nuances of which memory layout is used for the storage should not be the caller's problem. Today we have to extract the data in array format and pass the appropriate parameters to the providers, even though Matrix&lt;T&gt; and Vector&lt;T&gt; already know them.
For example, void AddVectorToScaledVector(T[] y, T alpha, T[] x, T[] result); should have a partner: void AddVectorToScaledVector(Vector&lt;T&gt; y, T alpha, Vector&lt;T&gt; x, Vector&lt;T&gt; result);
The same goes for Matrix&lt;T&gt; GEMM and similar operations.
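To make the proposal concrete, here is a minimal sketch of what such a partner overload could look like. The interface name mirrors Math.NET's existing ILinearAlgebraProvider, but the Vector-based overload is hypothetical; it does not exist in the library today:

```csharp
using MathNet.Numerics.LinearAlgebra;

// Sketch only: the Vector<T>-based overload is the proposed addition.
public interface IStorageAwareProvider<T> where T : struct
{
    // Existing array-based form: the caller must extract raw arrays
    // and pass the layout parameters explicitly.
    void AddVectorToScaledVector(T[] y, T alpha, T[] x, T[] result);

    // Proposed partner: the vectors carry their own storage, so the
    // provider can inspect the layout (dense, sparse, ...) and
    // dispatch to the right native routine without any copying.
    void AddVectorToScaledVector(Vector<T> y, T alpha, Vector<T> x, Vector<T> result);
}
```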
Just to clarify, this is about using the linear algebra providers directly from user code?
Having the providers operate on Matrix/Vector or maybe MatrixStorage/VectorStorage instances directly instead of arrays would be a major change, but would indeed have a few advantages:
- Allow much more efficient GPU providers
- Avoid combinatorial API explosion once we finally start to map sparse native algorithms
- Get rid of some duplicate logic in the type-specific matrix classes; those classes would become quite a bit simpler. It would also give a much cleaner separation: algorithms would always live in the providers (currently it's a bit mixed, with quite a few algorithms living directly in the matrix classes).
The providers would become more complicated though, which could be a disadvantage as we want to add at least one more provider with a liberal license (e.g. MAGMA).
Ah, I've just noticed that AddVectorToScaledVector is not actually exposed through vectors, so this is also related to #245.
I was actually thinking along those lines, but even if that were not the case, many of the BLAS/LAPACK operations are not exposed through Matrix/Vector. They are useful because we pay the memory-access cost only once for compound operations like result = a * M(A) * M(B) + b * M(C).
With the Matrix API that involves several separate operations, and therefore walks (and evicts) the caches more than once. And given that the native providers already have those fused operations, the only way to go fully native today is to drop down to raw float/double arrays.
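To illustrate the cache-line point: computing result = a*A*B + b*C through the public Matrix&lt;double&gt; API allocates temporaries and walks the data more than once, while a single BLAS GEMM call fuses everything into one pass over C's storage. A sketch, assuming the current shape of the provider API (MatrixMultiplyWithUpdate and ToColumnMajorArray; exact parameter order may differ across versions):

```csharp
using MathNet.Numerics.LinearAlgebra;

var A = Matrix<double>.Build.Random(100, 100);
var B = Matrix<double>.Build.Random(100, 100);
var C = Matrix<double>.Build.Random(100, 100);
double a = 2.0, b = 0.5;

// Matrix API: builds a temporary for A*B, scales it, then adds b*C --
// several passes over memory.
var result = a * (A * B) + b * C;

// BLAS-style GEMM computes C := alpha*op(A)*op(B) + beta*C in one fused
// pass, but the provider form today only accepts raw arrays, forcing
// the caller to extract the storage and restate its layout, e.g.:
//
//   provider.MatrixMultiplyWithUpdate(
//       Transpose.DontTranspose, Transpose.DontTranspose,
//       a, A.ToColumnMajorArray(), 100, 100,
//       B.ToColumnMajorArray(), 100, 100,
//       b, cArray);
```

A Matrix&lt;T&gt;-accepting partner would let user code get the fused native path without the extraction boilerplate.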
Have you considered using some T4 generated adaptation layer where code is highly repetitive?
We did use T4 originally a few years back, but then got rid of it (if I remember correctly it didn't work that well in practice and actually made things more complicated).