linfa-linalg icon indicating copy to clipboard operation
linfa-linalg copied to clipboard

Integration testing for LOBPCG

Open YuhanLiin opened this issue 3 years ago • 11 comments

I saw some interesting failures while property testing LOBPCG. The failure cases are covered in the tests problematic_eig_matrix and problematic_svd_matrix. The eig test straight-up yields the wrong eigenvalues and the SVD test yields the correct singular values but without the last value. @bytesnake any ideas on what's wrong?

YuhanLiin avatar May 25 '22 05:05 YuhanLiin

Hi YuhanLii, LOBPCG should not be used for full decompositions, the python source bails out when you want more than 1/5 of the eigenvectors/vals. https://github.com/scipy/scipy/blob/v1.8.1/scipy/sparse/linalg/_eigen/lobpcg/lobpcg.py#L339 I think it should be relatively easy to find problems where this fails. The missing eigenvalue should not occur though and requires more debugging. Don't have time today, but can give it a shot tomorrow

bytesnake avatar May 25 '22 06:05 bytesnake

Does the algorithm also fail if you compute the full decomp one eigenvalue at a time using the iterator? Does it also apply to SVD (meaning that you can't use it to compute full SVD)? Also, we should add the check from the Python code to our impl as well, and return an error if the user requests too many eigvals.

For the missing value, I think it may be because the rank of the matrix is lower than the number of eigvals requested, in which case the behaviour is correct.

YuhanLiin avatar May 25 '22 16:05 YuhanLiin

For the missing value, I think it may be because the rank of the matrix is lower than the number of eigvals requested, in which case the behaviour is correct.

yes that would be an explanation

Does the algorithm also fail if you compute the full decomp one eigenvalue at a time using the iterator? Does it also apply to SVD (meaning that you can't use it to compute full SVD)? Also, we should add the check from the Python code to our impl as well, and return an error if the user requests too many eigvals.

yes it applies to any solver derived from LOBPCG, I would add brakes to SVD and eigenvalue decomposition. The question is how many, I observed in application that 1/5 is a conservative bound but for most usecases sufficient

bytesnake avatar May 25 '22 16:05 bytesnake

Actually it can't be due to the rank because none of the single values of the input matrix in problematic_svd_matrix are 0

Also, the Marchenko-Pastur test uses a 1000x500 matrix but obtains 500 singular values. Doesn't that violate the 1/5 rule?

YuhanLiin avatar May 25 '22 18:05 YuhanLiin

Actually it can't be due to the rank because none of the single values of the input matrix in problematic_svd_matrix are 0

Also, the Marchenko-Pastur test uses a 1000x500 matrix but obtains 500 singular values. Doesn't that violate the 1/5 rule?

Yes, of course it does.

The 1/5 rule is there not only to prevent lobpcg possible failures, but also to enforce using a standard dense solver in situations where it is likely much more efficient compared to lobpcg.

lobpcg avatar Jun 29 '22 20:06 lobpcg

sorry for the long silence, just to add an explanation: I have never seen any failure for the Marchenko-Pastur distributed random matrices and therefore kept comparing the whole spectrum instead of first fifth.

bytesnake avatar Aug 01 '22 08:08 bytesnake

Then maybe we can have the internal algorithm solve the whole spectrum for testing but have the public API limit it to 1/5th. Or just change the Marchenko-Pastur test to do 1/5th so it runs faster.

YuhanLiin avatar Aug 07 '22 02:08 YuhanLiin

Then maybe we can have the internal algorithm solve the whole spectrum for testing but have the public API limit it to 1/5th. Or just change the Marchenko-Pastur test to do 1/5th so it runs faster.

here is a proposal https://github.com/rust-ml/linfa-linalg/commit/a780910fd5638b134ee5db5ba03198a08a07fea7 add branch for dense eigenproblem solver for TruncatedSvd and TruncatedEig

bytesnake avatar Aug 07 '22 19:08 bytesnake

@bytesnake I've implemented your change and it works on the previous failing cases. However, I'm getting another failing matrix on the truncated eigenvalues, which I've put into the problematic_eig test. The issue is that the eigenvector is wrong, even when extracting just 1 eigenvalue.

YuhanLiin avatar Aug 10 '22 04:08 YuhanLiin

Update: When I run the linfa PCA tests against this branch, I get the following failure:

thread 'pca::tests::test_explained_variance_diag' panicked at 'ndarray: inputs 4 × 4 and 3 × 3 are not compatible for matrix multiplication', /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:299:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
   1: core::panicking::panic_fmt
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:116:14
   2: ndarray::linalg::impl_linalg::dot_shape_error
             at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:299:5
   3: <ndarray::ArrayBase<S,ndarray::dimension::dim::Dim<[usize; 2]>> as ndarray::linalg::impl_linalg::Dot<ndarray::ArrayBase<S2,ndarray::dimension::dim::Dim<[usize; 2]>>>>::dot
             at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:273:13
   4: ndarray::linalg::impl_linalg::<impl ndarray::ArrayBase<S,ndarray::dimension::dim::Dim<[usize; 2]>>>::dot
             at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:257:9
   5: linfa_linalg::lobpcg::svd::TruncatedSvdResult<A>::values_vectors
             at /home/yuhan/projects/rust/ml/linfa-linalg/src/lobpcg/svd.rs:89:30
   6: <linfa_reduction::pca::PcaParams<f32> as linfa::traits::Fit<ndarray::ArrayBase<D,ndarray::dimension::dim::Dim<[usize; 2]>>,T,linfa_reduction::error::ReductionError>>::fit
             at ./src/pca.rs:138:43
   7: linfa_reduction::pca::tests::test_explained_variance_diag
             at ./src/pca.rs:390:21
   8: linfa_reduction::pca::tests::test_explained_variance_diag::{{closure}}
             at ./src/pca.rs:388:5
   9: core::ops::function::FnOnce::call_once
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/ops/function.rs:227:5
  10: core::ops::function::FnOnce::call_once
             at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/ops/function.rs:227:5

YuhanLiin avatar Aug 10 '22 04:08 YuhanLiin

@bytesnake any updates?

YuhanLiin avatar Oct 11 '22 02:10 YuhanLiin