Integration testing for LOBPCG
I saw some interesting failures while property testing LOBPCG. The failure cases are covered in the tests problematic_eig_matrix and problematic_svd_matrix. The eig test straight-up yields the wrong eigenvalues and the SVD test yields the correct singular values but without the last value. @bytesnake any ideas on what's wrong?
Hi YuhanLii, LOBPCG should not be used for full decompositions, the python source bails out when you want more than 1/5 of the eigenvectors/vals. https://github.com/scipy/scipy/blob/v1.8.1/scipy/sparse/linalg/_eigen/lobpcg/lobpcg.py#L339 I think it should be relatively easy to find problems where this fails. The missing eigenvalue should not occur though and requires more debugging. Don't have time today, but can give it a shot tomorrow
Does the algorithm also fail if you compute the full decomp one eigenvalue at a time using the iterator? Does it also apply to SVD (meaning that you can't use it to compute full SVD)? Also, we should add the check from the Python code to our impl as well, and return an error if the user requests too many eigvals.
For the missing value, I think it may be because the rank of the matrix is lower than the number of eigvals requested, in which case the behaviour is correct.
For the missing value, I think it may be because the rank of the matrix is lower than the number of eigvals requested, in which case the behaviour is correct.
yes that would be an explanation
Does the algorithm also fail if you compute the full decomp one eigenvalue at a time using the iterator? Does it also apply to SVD (meaning that you can't use it to compute full SVD)? Also, we should add the check from the Python code to our impl as well, and return an error if the user requests too many eigvals.
yes it applies to any solver derived from LOBPCG, I would add brakes to SVD and eigenvalue decomposition. The question is how many, I observed in application that 1/5 is a conservative bound but for most usecases sufficient
Actually it can't be due to the rank because none of the single values of the input matrix in problematic_svd_matrix are 0
Also, the Marchenko-Pastur test uses a 1000x500 matrix but obtains 500 singular values. Doesn't that violate the 1/5 rule?
Actually it can't be due to the rank because none of the single values of the input matrix in
problematic_svd_matrixare 0Also, the Marchenko-Pastur test uses a 1000x500 matrix but obtains 500 singular values. Doesn't that violate the 1/5 rule?
Yes, of course it does.
The 1/5 rule is there not only to prevent lobpcg possible failures, but also to enforce using a standard dense solver in situations where it is likely much more efficient compared to lobpcg.
sorry for the long silence, just to add an explanation: I have never seen any failure for the Marchenko-Pastur distributed random matrices and therefore kept comparing the whole spectrum instead of first fifth.
Then maybe we can have the internal algorithm solve the whole spectrum for testing but have the public API limit it to 1/5th. Or just change the Marchenko-Pastur test to do 1/5th so it runs faster.
Then maybe we can have the internal algorithm solve the whole spectrum for testing but have the public API limit it to 1/5th. Or just change the Marchenko-Pastur test to do 1/5th so it runs faster.
here is a proposal https://github.com/rust-ml/linfa-linalg/commit/a780910fd5638b134ee5db5ba03198a08a07fea7 add branch for dense eigenproblem solver for TruncatedSvd and TruncatedEig
@bytesnake I've implemented your change and it works on the previous failing cases. However, I'm getting another failing matrix on the truncated eigenvalues, which I've put into the problematic_eig test. The issue is that the eigenvector is wrong, even when extracting just 1 eigenvalue.
Update: When I run the linfa PCA tests against this branch, I get the following failure:
thread 'pca::tests::test_explained_variance_diag' panicked at 'ndarray: inputs 4 × 4 and 3 × 3 are not compatible for matrix multiplication', /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:299:5
stack backtrace:
0: rust_begin_unwind
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
1: core::panicking::panic_fmt
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/panicking.rs:116:14
2: ndarray::linalg::impl_linalg::dot_shape_error
at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:299:5
3: <ndarray::ArrayBase<S,ndarray::dimension::dim::Dim<[usize; 2]>> as ndarray::linalg::impl_linalg::Dot<ndarray::ArrayBase<S2,ndarray::dimension::dim::Dim<[usize; 2]>>>>::dot
at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:273:13
4: ndarray::linalg::impl_linalg::<impl ndarray::ArrayBase<S,ndarray::dimension::dim::Dim<[usize; 2]>>>::dot
at /home/yuhan/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.4/src/linalg/impl_linalg.rs:257:9
5: linfa_linalg::lobpcg::svd::TruncatedSvdResult<A>::values_vectors
at /home/yuhan/projects/rust/ml/linfa-linalg/src/lobpcg/svd.rs:89:30
6: <linfa_reduction::pca::PcaParams<f32> as linfa::traits::Fit<ndarray::ArrayBase<D,ndarray::dimension::dim::Dim<[usize; 2]>>,T,linfa_reduction::error::ReductionError>>::fit
at ./src/pca.rs:138:43
7: linfa_reduction::pca::tests::test_explained_variance_diag
at ./src/pca.rs:390:21
8: linfa_reduction::pca::tests::test_explained_variance_diag::{{closure}}
at ./src/pca.rs:388:5
9: core::ops::function::FnOnce::call_once
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/ops/function.rs:227:5
10: core::ops::function::FnOnce::call_once
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/core/src/ops/function.rs:227:5
@bytesnake any updates?