STRUMPACK
STRUMPACK copied to clipboard
Does STRUMPACK support sparse LDLT factorization for symmetric matrix?
From the Sparse Direct Solver section, it seems to use the general sparse LU factorization (when no compression is used). Is there a LDLT implementation for symmetric matrix (which would save ~half of compute&memory)? Thanks!
Hi, No, unfortunately we don't (yet) support LDLT.
Hi @pghysels I'd like to better understand the steps to develop symmetric LDLT from the existing general LU. Thanks!
From the book Multifrontal Methods Parallelism, Memory Usage and Numerical Aspects (written by MUMPS's author; MUMPS has BLR LU and LDLT implementations), there should be those differences at the abstract algorithm level:
- The eliminate tree structure should be simpler: in the symmetric case, the transitive reduction of the graph of explicit dependencies is the symmetric elimination tree (Chapter 1.2.3 Elimination graph structures)
- Need 2x2 pivot to maintain the symmetry of the frontal matrices (Chapter 1.3.2 Numerical accuracy and pivoting)
- The call to BLAS kernel is different (Chapter 2.1.4 Factorization of type 1 nodes)
The above are for full-rank exact factorization. Regarding the low-rank compression of frontal matrices, I assume that the compression algorithm (RRQR) should stay exact the same (symmetric vs general)?
What I am not sure is how those textbook modifications map to Strumpack's code structure --
- I would naively assume that the low-rank compression code (
src/sparse/fronts/FrontalMatrix*.cpp) should stay the same for symmetric and nonsymmetric matrices? - The major changes should be maded in
src/sparse/EliminationTree(MPI).hppand the matrix container (src/sparse/CSRMatrix(MPI).cpp), I think? - I notice a
symm_sparsebool attribute defined inCSRMatrixandCompressedSparseMatrix, but don't see how this tag is used during factorization
STRUMPACK uses a symmetric pattern, so no changes would be needed in the symbolic phase.
Since we use a symmetric pattern, we need to symmetrize the pattern before passing it to say METIS, or before doing the symbolic phase. The user can specify if the pattern is already symmetric with the symm_sparse input. But this doesn't really save much.
I think the main changes would just be to replace _getrf with _sytrf, and then replace the _trsm calls (I'm not sure by what). See here and the lines below that.
Then the forward/backward solve needs to be updated accordingly. The code for distributed memory (calling ScaLAPACK p_getrf) needs to modified, as well as the GPU code.
Keeping the low rank algorithms symmetric will be more complicated I'm afraid.
There is also the extend-add phase which passes the Schur complement (22 block of the front) to the parent in the tree. This phase now assumes we have the whole front, and not just the lower/upper triangular part.
Thanks for the explanation!
Keeping the low rank algorithms symmetric will be more complicated I'm afraid.
Could you elaborate more on this point?
From the MUMPS BLR paper, the only few comments on the symmetric BLR factorization are:
- "Note that Algorithm 1 (Dense BLR LU factorization) can be easily adapted to the symmetric (definite and indefinite) case, where operations on U are not performed in order to take advantage of the symmetry."
- "The flexibility of the BLR format makes it compatible with standard threshold partial pivoting, ... In the case of symmetric matrices, only diagonal pivoting is allowed in order to maintain symmetry, so that pivots must be part of AI,I ; two-by-two pivots are used when necessary."
At least for the BLR case, the modification looks minor? For the hierachical compressions involving tree structures (HSS, HODLR), the changes will be more complicated indeed.
then replace the
_trsmcalls (I'm not sure by what).
I think trsm is the only LAPACK triangular solve routine -- the triangular factors should not be aware of the symmetry of the original matrix.
The one for LDLT should be sytrs, which works for both diagonal or 2x2 block diagonal D
The problem is that dsytrf does not return the triangular factor. So we cannot use trsm directly. sytrs solves a system, so it solves with both the triangular factor and it's transpose.