Degeneracy Hunter SVD analysis
Using Degeneracy Hunter to debug a flowsheet that isn't converging, I'm running into problems with `scipy.sparse.linalg.svds` when computing the smallest singular values of the Jacobian. Apparently, ARPACK has trouble finding small eigenvalues without shifting the spectrum, and `svds` makes no attempt to shift the spectrum. We probably need to create a special implementation of `svds` for our purposes here.
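For context, a minimal sketch of the failing pattern, assuming `J` is the sparse Jacobian (the `k` and `maxiter` values here are illustrative):

```python
from scipy.sparse.linalg import svds, ArpackNoConvergence

try:
    # which="SM" asks ARPACK for the smallest singular values directly;
    # without a spectral shift, this often stagnates or fails outright.
    u, s, vt = svds(J, k=5, which="SM", maxiter=5000)
except ArpackNoConvergence as err:
    print("ARPACK failed to converge:", err)
```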
A minor, tangentially related issue is that `check_rank_equality_constraints` only computes the SVD once, then returns the same SVD for future calls. This would cause problems if a user, for example, wanted to check the rank at the initialization point, then check the rank again at the point IPOPT converges to.
@dallan-keylogic, let me know if you disagree with the priority and target release.
I've had success using `solver="lobpcg"` in `svds`.
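In case it helps, that amounts to something like the following (the `solver` argument requires a reasonably recent SciPy; `k` is illustrative):

```python
from scipy.sparse.linalg import svds

# LOBPCG instead of the default ARPACK backend
u, s, vt = svds(J, k=5, which="SM", solver="lobpcg")
```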
@dallan-keylogic How large is the model? Are there specific solver options we should add in Degeneracy Hunter?
@adowling2 It's actually only around an 800x800 square problem. Using a normal (dense) SVD works fine. Using `solver="lobpcg"` didn't help in my case. We need to form the normal matrix `X'X` and then call `eigsh` in shift-invert mode.
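A minimal sketch of that idea, assuming `J` is a SciPy sparse matrix (the shift `sigma` is illustrative and needs tuning):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

# Form the normal matrix X'X; CSC format suits the shift-invert factorization.
JTJ = (J.T @ J).tocsc()

# Setting sigma puts eigsh in shift-invert mode; which="LM" then selects
# the eigenvalues closest to the shift.
eigvals, eigvecs = eigsh(JTJ, k=5, sigma=1e-12, which="LM")
print(np.sqrt(np.abs(eigvals)))  # estimates of the smallest singular values
```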
I'm going to create a small PR replacing `svds` with `svd`, along with some other necessary changes (getting the scaled Jacobian rather than the raw Jacobian, and making sure pretty-printing constraints isn't on by default so users don't get bombarded with 20 pages of thermo garbage). We can homebrew our own `svds` using the right options in the medium term.
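For reference, the dense stopgap is just the following (assuming `J` is the sparse scaled Jacobian; fine at ~800x800, but memory and time grow quickly beyond that):

```python
import numpy as np
from scipy.linalg import svd

# Dense SVD of the full Jacobian via LAPACK
u, s, vt = svd(J.toarray(), full_matrices=False)
print(s[-5:])  # the smallest singular values sit at the end of s
```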
@dallan-keylogic Good plan. My suggestion is to add `svd_option` as an input argument to Degeneracy Hunter. We can then support `dense`, `sparse`, etc.
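Something along these lines, purely as an illustration of how the option could be threaded through (this is not the actual IDAES implementation; `dense_svd`, `sparse_svds`, and the attribute names are hypothetical):

```python
def check_rank_equality_constraints(self, svd_option="dense", n_sv=5):
    """Compute the n_sv smallest singular values of the equality Jacobian."""
    if svd_option == "dense":
        u, s, vt = dense_svd(self.jac_eq)    # scipy.linalg.svd path
    elif svd_option == "sparse":
        u, s, vt = sparse_svds(self.jac_eq)  # shift-invert eigsh path
    else:
        raise ValueError(f"Unrecognized svd_option: {svd_option}")
    # ... rank check proceeds with u, s, vt ...
```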
@shermanjasonaf has another moderately sized system (7698 x 15484) where this is causing an issue. @dallan-keylogic where does this stand? @shermanjasonaf (with help from me) may be willing to test out changes in a draft PR.
I spent a morning trying a few things without success, then just resigned myself to using the dense SVD. However, I finally have some time to work on this again, so hopefully I can find a good working method.
@adowling2 @shermanjasonaf I pushed my first attempt at a homebrew version of `svds` to the `homebrew_svds` branch of my IDAES fork.
Like `svds`, we use `scipy.sparse.linalg.eigsh` on the normal matrix in order to compute a set of singular vectors. Unfortunately, the price we pay for adequate performance is the introduction of a tuning parameter: the shift for inverse iteration in `eigsh`. Because we've formed the normal matrix, a singular value of `1e-8` corresponds to an eigenvalue of `1e-16`, so we need to use an extremely small shift and may encounter significant roundoff error. However, if we choose a shift too close to an eigenvalue, the iteration matrix becomes singular and SciPy throws an error, so some tuning might be necessary in order to find a good shift value.
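A condensed sketch of the approach (not necessarily the exact code on the `homebrew_svds` branch; the default `shift` is illustrative):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def homebrew_svds(J, k=5, shift=1e-14):
    """Approximate the k smallest singular triplets of sparse J."""
    JTJ = (J.T @ J).tocsc()
    # Shift-invert targets eigenvalues nearest `shift`. A shift that lands
    # on an eigenvalue makes (JTJ - shift*I) singular, hence the tuning.
    eigvals, V = eigsh(JTJ, k=k, sigma=shift, which="LM")
    s = np.sqrt(np.abs(eigvals))          # singular values of J
    # Left singular vectors: u_i = J v_i / s_i (guard against s_i == 0)
    U = (J @ V) / np.where(s > 0, s, 1.0)
    order = np.argsort(s)
    return U[:, order], s[order], V[:, order]
```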
Comparing the output obtained through this homebrew method to that obtained from the dense SVD (on a ~4000 x 4000 problem), there were surprising differences in the computed singular values and vectors. Sparse SVD methods can skip singular values because they depend on random initialization, but it seemed like the dense SVD skipped a singular value as well, which shouldn't happen. The corresponding singular vectors for (more or less) corresponding singular values calculated by the different methods were not always orthogonal, either. My guess is that numerical weirdness is happening because we're operating in the domain of roundoff error.
My conclusion was that both dense and sparse methods returned sets of orthonormal vectors whose singular values accurately represented the action of the Jacobian matrix on those singular vectors. That's really all we need (or at least all I need) for Degeneracy Hunter. The sparse method was two orders of magnitude more accurate in terms of representing the action of the Jacobian on its set of singular vectors, but that was ~`1e-17` vs. `1e-15` for singular values in the range of `1e-8` through `1e-6`.
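For anyone reproducing the comparison, the consistency check amounts to measuring residuals like this (a sketch, assuming dense `U`, `s`, `V` arrays from either method):

```python
import numpy as np

def svd_residuals(J, U, s, V):
    """Per-triplet residuals ||J v_i - s_i u_i||; small values mean the
    triplet faithfully represents the action of the Jacobian."""
    return np.linalg.norm(J @ V - U * s, axis=0)
```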
I'll be interested in seeing how the method performs on the problems @shermanjasonaf is working on. I haven't invested heavily in exception catching yet because I'd like a better idea of what we're up against.
Discussion from today:
- We really want to estimate the "almost" null space
- Rank-revealing QR (see the sketch after this list)? https://andreask.cs.illinois.edu/cs598apk-f15/demos/02-tools-for-low-rank/Rank-Revealing%20QR.html
- https://www.tspi.at/2021/12/08/qrgivens.html
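A minimal sketch of column-pivoted (rank-revealing) QR with SciPy, assuming a dense Jacobian and an illustrative tolerance:

```python
import numpy as np
from scipy.linalg import qr

# Column-pivoted QR; pivoting makes |diag(R)| non-increasing.
Q, R, piv = qr(J.toarray(), pivoting=True)

# Diagonal entries of R below a tolerance suggest the numerical rank;
# `piv` identifies the nearly dependent columns.
tol = 1e-8 * abs(R[0, 0])
numerical_rank = int(np.sum(np.abs(np.diag(R)) > tol))
```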
Idea we discussed today:
$$ \min_{v} || J \cdot v ||, \quad \mathrm{s.t.} ~ || v || = 1 $$
The objective could use an L-2 norm and the problem would likely still be convex. For the constraint, we could use an L-1 norm. While $v$ would not be precisely a singular vector, it would be a vector in the "almost null space".
We could then solve a subsequent problem:
$$ \min_{v} w || J \cdot v || + ||v||, \quad \mathrm{s.t.} ~ || v || \geq 1 $$
where $||v||$ in the objective is L-1 to induce some sparsity and $||v||$ in the constraint is L-2 to find a unit vector.
After finding $v_1$, we could add a constraint that $v_1 \cdot v = 0$ (with $v_1$ fixed as a parameter) to induce orthogonality.
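A rough Pyomo sketch of the first problem (squared L-2 objective, L-2 unit-norm constraint; the solver choice and initialization are assumptions, and the norm constraint makes the problem nonconvex):

```python
import numpy as np
import pyomo.environ as pyo

def almost_null_vector(J, solver="ipopt"):
    """Find a unit vector v that J maps close to zero (J: dense m x n array)."""
    m, n = J.shape
    mdl = pyo.ConcreteModel()
    mdl.N = pyo.RangeSet(0, n - 1)
    mdl.v = pyo.Var(mdl.N, initialize=1.0 / np.sqrt(n))
    # min ||J v||_2^2
    mdl.obj = pyo.Objective(
        expr=sum(sum(J[r, c] * mdl.v[c] for c in mdl.N) ** 2 for r in range(m))
    )
    # ||v||_2^2 = 1 rules out the trivial solution v = 0
    mdl.unit = pyo.Constraint(expr=sum(mdl.v[c] ** 2 for c in mdl.N) == 1)
    pyo.SolverFactory(solver).solve(mdl)
    return np.array([pyo.value(mdl.v[c]) for c in mdl.N])

# After finding v1, orthogonality could be enforced on the next solve with
#   mdl.orth = pyo.Constraint(expr=sum(v1[c] * mdl.v[c] for c in mdl.N) == 0)
```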