
Pivoted Cholesky

vlad17 opened this issue Sep 01 '20 · 3 comments

Currently, Cholesky will fail on matrices that are PSD but not PD, because it calls ?pptrf.

use ndarray::{array, Axis};
use ndarray_linalg::cholesky::*;

#[allow(non_snake_case)]
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let x1 = array![1., 2., 3., 4., 7.];
    let x2 = array![-1., 2., 3., 5., 8.];
    let x3 = array![1., -2., 3., 6., 9.];
    // x4 is a linear combination of the others, so X^T X is PSD but rank-deficient.
    let x4 = &x1 * 1. + &x2 * 2. - &x3 * 3.;
    let xs = [x1, x2, x3, x4];
    let xs = xs
        .iter()
        .map(|x| x.view().insert_axis(Axis(1)))
        .collect::<Vec<_>>();
    let X = ndarray::stack(Axis(1), &xs)?; // 5x4
    println!("{:?}", X);
    let XTX = X.t().dot(&X); // 4x4, PSD, rank 3
    println!("{:?}", XTX);
    let _chol = XTX.factorizec(UPLO::Lower)?; // Error: Lapack { return_code: 4 }
    Ok(())
}

However, if we allow pivoting, then we can return a Cholesky factor U and a pivot matrix P such that P U^T U P^T = A for an input matrix A that is merely PSD (the routine also returns the rank r).

It would be nice if ndarray-linalg could also provide this pivoted version, e.g., as shown here in Python:

from scipy.linalg.lapack import dpstrf
import numpy as np

xt = np.array([
    [1, 2, 3, 4, 7],
    [-1, 2, 3, 5, 8],
    [1, -2, 3, 6, 9]]).astype(float)
# Append a linearly dependent column so that xtx is PSD with rank 3.
x = np.insert(xt, len(xt), xt[0] + 2 * xt[1] - 3 * xt[2], axis=0).T
xtx = x.T.dot(x)
U, P, rank, info = dpstrf(xtx)
assert info > 0  # positive INFO: the matrix is rank-deficient
assert rank == 3
P -= 1  # LAPACK pivots are 1-based
U = np.triu(U)
invP = np.empty_like(P)
invP[P] = np.arange(len(P), dtype=int)
print(np.linalg.norm(U.T.dot(U)[np.ix_(invP, invP)] - xtx, ord='fro')) # row indexing inverts permutations
# 3.552713678800501e-14

An interesting design question is what the interface should be. Clearly this routine should return the factor, pivot, and rank in some form. But it would be nice if I could take my pivoted Cholesky output, truncate U to its leading r×r principal submatrix, and initialize a CholeskyFactorized struct directly, so that I can just re-use the existing methods for solving the reduced subsystem. A sketch of one possible interface is below.
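For what it's worth, here is one possible shape for such an interface. This is purely a sketch: the trait, struct, and field names below are invented for illustration and are not existing ndarray-linalg API; it just mirrors what ?pstrf returns (factor, pivots, detected rank).

use ndarray::{Array1, Array2};

/// Hypothetical output of a pivoted Cholesky: P^T A P = U^T U up to the detected rank.
pub struct PivotedCholesky {
    pub factor: Array2<f64>, // upper-triangular U from ?pstrf
    pub pivots: Array1<i32>, // permutation as LAPACK returns it (1-based)
    pub rank: usize,         // numerical rank detected by ?pstrf
}

/// Hypothetical trait; `tol` plays the role of ?pstrf's TOL argument.
pub trait CholeskyPivoted {
    fn cholesky_pivoted(&self, tol: f64) -> Result<PivotedCholesky, Box<dyn std::error::Error>>;
}

With something like this, truncating factor to its leading rank × rank block would give exactly the pieces needed to build the reduced system described above.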

vlad17 · Sep 01 '20

There is a solveh submodule for symmetric/Hermitian matrices, which calls [sd]sytr[fsi] and [cz]hetr[fsi].
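For reference, a minimal sketch of using that submodule through the SolveH trait on a symmetric indefinite system (the specific matrix here is just an illustration, not from the thread):

use ndarray::array;
use ndarray_linalg::SolveH;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A symmetric indefinite matrix: plain Cholesky fails on it,
    // but the ?sytrf-based solveh path handles it.
    let a = array![[0., 1.], [1., 0.]];
    let b = array![2., 3.];
    let x = a.solveh(&b)?; // expected solution: [3., 2.]
    println!("{:?}", x);
    Ok(())
}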

termoshtt · Sep 05 '20

Awesome, thanks for pointing me to it! For anyone else coming into this issue, here is some of the sleuthing I had to do, to save you the trouble:

?sytrf/?hetrf perform the Bunch–Kaufman factorization. Section 4.4 of Golub & Van Loan's Matrix Computations and Chapter 11 of Higham's Accuracy and Stability of Numerical Algorithms describe these and other methods for solving symmetric indefinite systems.

It is basically an LDL^T decomposition where D is block-diagonal with 1×1 and 2×2 blocks (Bunch–Kaufman provides a partial pivoting strategy, rook pivoting is an "intermediate" pivoting strategy, and Bunch–Parlett gives complete pivoting).
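As a small worked illustration (my own, not from the references above) of why the 2×2 blocks are needed, take the 2×2 exchange matrix: its (1,1) entry is zero, so an LDL^T with only 1×1 pivots would require dividing by zero, while allowing a single 2×2 block makes the factorization trivial:

A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  = L D L^{\mathsf{T}}
  \quad\text{with}\quad
  L = I_2,
  \quad
  D = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},

i.e., D consists of one 2×2 block.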

Interestingly, a recent LAPACK working note discusses another approach based on Aasen's method; its empirical stability results generally favor it (though it does not dominate, see Figure 8). Aasen's method decomposes into L T L^T where T is symmetric tridiagonal.

Note that these indefinite decompositions solve a more general problem than the one in the original issue, which only concerns positive semidefinite systems. I'll try the indefinite approach in my use case.

@termoshtt I couldn't find anything that compared the stability of these methods when applied to positive semidefinite systems, where in principle ?pstrf seems like the right tool for the job. Do you have a sense of how the stability would compare?

vlad17 · Sep 05 '20

solveh as it stands does not let users work with PSD matrices reliably.

Example:

use ndarray::{array, Axis};
use ndarray_linalg::FactorizeHInto;

#[allow(non_snake_case)]
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Remove the "/ 10." and the factorization no longer returns an error,
    // thanks to numerical imprecision.
    let x1 = array![1., 2., 3., 4., 7.] / 10.;
    let x2 = array![-1., 2., 3., 5., 8.] / 10.;
    let x3 = 2. * &x2; // collinear with x2, so X^T X is singular (PSD, rank 2)
    let xs = [x1, x2, x3];
    let xs = xs
        .iter()
        .map(|x| x.view().insert_axis(Axis(1)))
        .collect::<Vec<_>>();
    let X = ndarray::stack(Axis(1), &xs)?;
    assert_eq!(X.shape(), &[5, 3]);
    let XTX = X.t().dot(&X);
    println!("{:?}", X);
    println!("{:?}", XTX);
    let _f = XTX.factorizeh_into()?; // Error: Lapack { return_code: 2 }
    Ok(())
}

The problem:

  1. x2 and x3 are collinear.
  2. The second diagonal entry of D in the factorization X^T X = U * D * U^T is exactly zero.
  3. LAPACK spots this and returns INFO=2. Note that the factorization has still completed successfully at this stage (see the docs).
  4. The solveh code as it stands treats any INFO != 0 as an error and returns Err(LinalgError::Lapack { return_code }).

In my opinion, INFO > 0 from the solveh factorization should not be treated as an error: it is expected for some PSD inputs, and the factorization in those cases still completes successfully.
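To make the suggestion concrete, here is a minimal sketch (my own, not ndarray-linalg code; the enum and function names are invented) of the INFO-handling policy I have in mind: only negative INFO (an invalid argument) is a hard error, while positive INFO is surfaced as information about an exactly singular diagonal block.

/// Hypothetical status for a completed ?sytrf/?hetrf factorization.
#[derive(Debug)]
enum FactorizeHStatus {
    Ok,
    /// INFO = i > 0: D(i,i) is exactly zero, but the factorization itself completed.
    SingularDiagonalBlock { index: i32 },
}

/// Map LAPACK's INFO to the proposed policy instead of treating any INFO != 0 as Err.
fn interpret_sytrf_info(info: i32) -> Result<FactorizeHStatus, String> {
    match info {
        0 => Ok(FactorizeHStatus::Ok),
        i if i > 0 => Ok(FactorizeHStatus::SingularDiagonalBlock { index: i }),
        i => Err(format!("illegal value in argument {} of ?sytrf", -i)),
    }
}

fn main() {
    // The INFO = 2 from the example above would be reported, not turned into an error.
    println!("{:?}", interpret_sytrf_info(2));
}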

Remark: the error disappears if I remove the "/ 10." above, because then the second diagonal entry of D is only almost zero, due to numerical imprecision : )

Arnaud15 · Sep 15 '20