Implement packed format for symmetric and triangular matrices
When working with symmetric, upper-triangular, and lower-triangular matrices in LinearAlgebra, it is necessary to allocate $N^2$ numbers, but there are only at most $N(N+1)/2$ distinct entries. This is in contrast to common Fortran linear algebra packages.
Consider the following code:
using LinearAlgebra;
N = 5;
x = zeros(N,N);
z = 0.;
for j in 1:N
for i in 1:j
z+=1;
x[i,j] = z;
end
end
xut = UpperTriangular(x);
xsym = Symmetric(x);
x === xut.data # returns true
x === xysm.data # returns true
For reference, x looks like:
5×5 Matrix{Float64}:
1.0 2.0 4.0 7.0 11.0
0.0 3.0 5.0 8.0 12.0
0.0 0.0 6.0 9.0 13.0
0.0 0.0 0.0 10.0 14.0
0.0 0.0 0.0 0.0 15.0
One can see that UpperTriangular and Symmetric are just wrapping the memory allocated in x which means that if you don't actually care about the lower-triangular part of x, you're unnecessarily storing $N(N-1)/2 = O(N^2/2)$ zeroes! The purpose of these wrappers is so that when some routine from LinearAlgebra is called, the appropriate method optimized for the matrix type is called. In BLAS, LAPACK, and CuBLAS, there are equivalent routines which support a "packed" format which essentially stores the non-zero/necessary entries as a vector in column-major-like form. In this packed format, xut.data would be stored as [1.0, 2.0, 3.0, ..., 15.0]. Here are some links to the documentation where some of these functions and features are discussed:
- https://www.netlib.org/lapack/lug/node123.html#subsecpacked
- https://www.netlib.org/lapack/explore-html/d7/d15/group__double__blas__level2_ga1d9a8ecfddfea2c84e73e28e1ebb74cf.html
- https://www.netlib.org/lapack/explore-html/d7/d15/group__double__blas__level2_ga0fff73e765a7655a67da779c898863f1.html
- https://docs.nvidia.com/cuda/cublas/index.html#cublas-t-tpsv
I think exposing these routines in LinearAlgebra and CUDA.jl would be useful and improve memory efficiency by a factor of two, approximately. I imagine there might be some new types, like LowerTriangularPacked, UpperTriangularPacked, SymmetricPacked, whose data fields would be the packed format used in these linear algebra routines. Some implementations that might be nice besides the access to the BLAS, LAPACK, and CuBlas routines could be:
- A constructor that takes a
Matrix-typed variable likexand converts it to a packed-type variable by extracting and flattening the appropriate entries - A constructor that directly takes a
Vector-typed variable in a packed format - Appropriate indexing: provide a row-column index like
xut[i,j]and access the appropriate linear index in the packed format - Iterate over relevant indices in an efficient way
- A method to convert these packed types back to dense matrices
I would greatly appreciate it if this were implemented, and I believe it would improve Julia as a whole. Thanks!
Does https://github.com/JuliaLinearAlgebra/RectangularFullPacked.jl implement what you want?
I believe @bjarthur was working on vector-based packed format linear algebra; I found https://github.com/JaneliaSciComp/SymmetricFormats.jl. I remember that we do have some LAPACK functions available in LinearAlgebra (#34211, JuliaLang/julia#34320), but we don't have wrapper types.
i was indeed, and yes you found it! please file an issue there if you have problems or suggestions
This is nice. Thank you so much for sharing. I initially became interested in this because I am memory-limited on a GPU for a project, so ideally this would work on the GPU with CUDA.jl. I also would enjoy seeing this implemented in the default LinearAlgebra with appropriate method and in a packed format with pretty-printing, but I understand that this might not happen.
Thanks again!