ndarray-linalg
ndarray-linalg copied to clipboard
Pivoted Cholesky
Currently, Cholesky will fail on PSD but not PD matrices, because it calls ?pptrf
.
use ndarray::{array, Array, Axis};
use ndarray_linalg::cholesky::*;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let x1 = array![1., 2., 3., 4., 7.];
let x2 = array![-1., 2., 3., 5., 8.];
let x3 = array![1., -2., 3., 6., 9.];
let x4 = &x1 * 1. + &x2 * 2. - &x3 * 3.;
let xs = [x1, x2, x3, x4];
let xs = xs
.iter()
.map(|x| x.view().insert_axis(Axis(1)))
.collect::<Vec<_>>();
let X = ndarray::stack(Axis(1), &xs)?;
println!("{:?}", X);
let XTX = X.t().dot(&X);
println!("{:?}", XTX);
let chol = XTX.factorizec(UPLO::Lower)?; // Error: Lapack { return_code: 4 }
Ok(())
}
However, if we allow pivoting, then we can return a cholesky factor U
, pivot matrix P
such that P U^T U P^T = A
for an input matrix A
that's merely PSD (this also returns the rank r
).
It would be nice if ndarray-linalg could also provide this pivoted version, e.g., as shown here in python:
from scipy.linalg.lapack import dpstrf
import numpy as np
xt = np.array([
[1, 2, 3, 4, 7],
[-1, 2, 3, 5, 8],
[1, -2, 3, 6, 9]]).astype(float)
x = np.insert(xt, len(xt), xt[0] + 2 * xt[1] - 3 * xt[2], axis=0).T
xtx = x.T.dot(x)
U, P, rank, info = dpstrf(xtx)
assert info > 0
assert rank == 3
P -= 1
U = np.triu(U)
invP = np.empty_like(P)
invP[P] = np.arange(len(P), dtype=int)
print(np.linalg.norm(U.T.dot(U)[np.ix_(invP, invP)] - xtx, ord='fro')) # row indexing inverts permutations
# 3.552713678800501e-14
An interesting design question would be what the interface should be. Clearly this routine should return the factor, pivot, and rank in some form. But it'd be nice if I could take my pivoted Cholesky output, truncate U
to its leading principal minor of order r
, and initialize a CholeskyFactorized
struct directly, so that I can just re-use existing methods for solving the reduced subsystem.
There is solveh submodule for PD matrices which calls [sd]sytr[fsi]
and [cz]hetr[fsi]
.
Awesome, thanks for pointing me to it! For anyone else coming in to this issue, to save you some sleuthing I needed to do:
?sytrf
/?hetrf
perform the Bunch–Kaufman factorization. Sec 4.4 in Matrix Computations and Chapter 11 of Higham describe these and other methods for solving symmetric indefinite systems.
There's basically LDL^T
decomposition where D
is block-diagonal (for which Bunch–Kaufman provides a partial pivoting strategy, rook pivoting is an "intermediate" pivoting strategy, and Bunch and Parlett give complete pivoting).
Interestingly, a recent LAPACK working note discusses another approach based on Aasen's method, where empirical results for stability generally favor it (but it does not dominate, see Figure 8). Aasen's method decomposes into LTL^T
where T
is symmetric tridiagonal.
Note that these indefinite decompositions are solving more general problems than the original issue points to, which is just looking at positive semidefinite systems. I'll try the indefinite approach in my use case.
@termoshtt I couldn't find anything that compared the stability of these methods when applied to positive semidefinite systems, where in principle ?pstrf
seems like the right tool for the job. Do you have a sense of how the stability would compare?
The solveh
as it stands does not let users work with PSD matrices in a reliable fashion.
Example:
use ndarray::{Axis, array, arr1};
use ndarray_linalg::{FactorizeHInto, SolveH};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let x1 = array![1., 2., 3., 4., 7.] / 10.; // Remove this division and factorization will not return an error, thanks to numerical imprecisions.
let x2 = array![-1., 2., 3., 5., 8.] / 10.;
let x3 = 2. * &x2;
let xs = [x1, x2, x3];
let xs = xs
.iter()
.map(|x| x.view().insert_axis(Axis(1)))
.collect::<Vec<_>>();
let X = ndarray::stack(Axis(1), &xs)?;
assert_eq!(X.shape(), &[5, 3]);
let XTX = X.t().dot(&X);
println!("{:?}", X);
println!("{:?}", XTX);
let f = XTX.factorizeh_into()?; // Error: Lapack { return_code: 2 }
}
The problem:
- x_2 and x_3 are colinear.
- The second diagonal entry for D in the factorization X^T X = U * D * U^T is exactly zero.
- LAPACK spots it and returns an INFO=2. Note that the factorization completed successfully at this stage (see docs).
- The
solveh
code as it is treats any INFO =/= 0 as an error and returnsErr(LinalgError::Lapack { return_code })
In my opinion, cases where INFO > 0 for the solveh
factorization should not be treated as an error, as it is expected to happen for some PSD inputs, and the factorization in those cases is still successful.
Remark: the error disappears if I remove the "/ 10." above, because then the second diagonal entry for D is only almost 0, due to numerical imprecisions : )