KrylovKit.jl icon indicating copy to clipboard operation
KrylovKit.jl copied to clipboard

Add Two-sided Arnoldi method

Open tipfom opened this issue 11 months ago • 25 comments

I've added an implementation of the Two-sided Krylov-Schur Restarted Arnoldi method in Ref. [1, Alg. 2] to KrylovKit.jl. The implementation is not done yet and needs some improvements (e.g., unit tests and an update to the convergence criterion).

It is currently only tested for finding the smallest real eigenvalue :SR for dense Julia matrices. The interface is as close as possible to the other methods, i.e., for a dense matrix A calling eigsolve(A, collect(adjoint(A)), v0, w0, 4, :SR, BiArnoldi(; krylovdim=20, verbosity=10, maxiter=100)) yields the 4 smallest real value left and right eigenvalues, vectors and convergence information.

Are you interested in merging this into KrylovKit.jl? If so, what should I additionally add (next to tests and the improved convergence criterion)?

I'm interested in using this method downstream (in ITensors.jl) and hence incorporating this into the package would be very much appreciated. I'm willing to do the extra work required, if it is not too much.

Thanks for your work!

[1] Krylov-Schur-Type Restarts for the Two-Sided Arnoldi Method, Ian N. Zwaan and Michiel E. Hochstenbach

tipfom avatar Mar 19 '25 14:03 tipfom

Thanks for this; it seems like an impressive amount of work. I will have to look deeper into this, since I am on my way out of the office right now. Feel free to remind me if you haven't heard back in a few days.

A first comment: look at svdsolve and lssolve for the interface I use for methods that require both the linear map and its adjoint.

Related to this: since this algorithm can not have the exact same arguments and outputs as the other eigsolve routines, I would pack it under a different name, such as bieigsolve or eigsolve2. At some point, we were actually discussing having a new routine like this, when we believed that to implement the pullback of eigsolve, we would actually also need the corresponding left eigenvectors. However, since I then found a way to compute the pullback without needing the left eigenvectors, we halted the discussion of implementing such a method, however, I am still very supportive of having it.

Jutho avatar Mar 19 '25 17:03 Jutho

Great to hear that this may find its way into the library! I've now updated the function signature to be closer to lssolve and renamed it to bieigsolve. Also, there are now tests and some further improvements to the convergence criterion etc.

However, in my tests (which I copied from the eigsolve tests), I get fails because MinimalVec does not support dot and normalize!, which I need to update the oblique projection. Is this something to care about? Some @test_logs fail as well.

Also, the return type of bieigsolve is currently still to clumsy, as it returns both left/right eigenvalues/vectors and convergence info. I'm interested in your suggestions on this. Additionally, the returned residuals are not the actual residuals. For the latter, I might need to apply the function again to the obtained eigenvectors, which I did not deem cost-efficient for this information. This still needs improvements as well.

tipfom avatar Mar 20 '25 13:03 tipfom

dot should just be inner. normalize! could be scale!(v, 1/norm(v)); the problem is probably that normalize! from LinearAlgebra.jl calls rmul! or something, which is indeed not defined for MinimalVec.

Jutho avatar Mar 20 '25 17:03 Jutho

Codecov Report

:x: Patch coverage is 87.54717% with 33 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 88.70%. Comparing base (054a6ae) to head (29f185f). :warning: Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/factorizations/biarnoldi.jl 66.66% 16 Missing :warning:
src/eigsolve/biarnoldi.jl 92.51% 14 Missing :warning:
src/innerproductvec.jl 50.00% 2 Missing :warning:
src/eigsolve/arnoldi.jl 95.65% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #124      +/-   ##
==========================================
- Coverage   88.90%   88.70%   -0.20%     
==========================================
  Files          34       36       +2     
  Lines        3685     3930     +245     
==========================================
+ Hits         3276     3486     +210     
- Misses        409      444      +35     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Mar 22 '25 23:03 codecov[bot]

@tipfom , are you ok with me pushing some changes directly to your branch?

Jutho avatar Mar 24 '25 19:03 Jutho

@Jutho ~~Yes, I'll take care of it and also add docstrings :)~~

tipfom avatar Mar 24 '25 19:03 tipfom

@Jutho, sorry, I misread your message. Of course you're very much welcome to push to this branch :)

tipfom avatar Mar 24 '25 19:03 tipfom

Ok, I've simplified the BiArnoldi factorization and iteration to simply reuse everything from Arnoldi, as I think that was wat it amounted to. It's easier to maintain the code and make improvements in the future if there is less code duplication.

In bieigsolve, I've currently only made the changes required to make it work in combination with the changed factorization type. I will go through that function in more detail but this will have to wait until tomorrow or Wednesday. However, the code is very nicely structured and written, so I don't expect anything stalling this much longer. Hats off for a very nice PR!

Jutho avatar Mar 24 '25 22:03 Jutho

The eigenvalues are now properly sorted and the eigenvalues properly biorthonormalized.

tipfom avatar Mar 25 '25 12:03 tipfom

My apologies for the slow progress from my side; it's a busy week. I've left another question in the comments.

Jutho avatar Mar 27 '25 22:03 Jutho

My apologies for the long silence, I was on holiday last week and didn't really get to continue on this. I will try to finish my review / changes from my side this week.

Jutho avatar Apr 14 '25 12:04 Jutho

Looking at this once again, I am making some more changes locally and will push them asap. However, I have two more questions:

  1. One of my comments regarding the residual computation was that I think for kappa_j you need not simply M[converged+1,converged+1] but rather Z[:,converged+1]'*M*Q[:,converged+1].

  2. Can you elucidate what the problem is with the eigenvalues of H and K not being matched up perfectly? Is this only about complex conjugate eigenvalue pairs in the case of using real arithmetic? Or is it more general? You know only fix this in the final step, where you compute the eigenvectors, but I would like for this to be true during the Schur process. I am not sure if it is strictly necessary, but I could foresee problems, where e.g. in deciding on the keep variable, it might be that both H and K have 2 x 2 blocks that misalign, so that no keep can be chosen that doesn't cut through blocks in either of those two matrices.

Jutho avatar Apr 17 '25 17:04 Jutho

Actually, looking more in the computation of the convergence criterion, I don't understand why the authors want to use a relative precision where they divide by rho_j, being the eigenvalue. This doesn't make sense in my opinion. It can be perfectly valid to target eigenvalues that happen to be close to zero, as long as they are extremal. Furthermore, adding identity to an operator changes nothing to the Krylov subspace and thus to the method, but arbitrarily shifts eigenvalues and thus the meaning of relative errors. If you agree with this, I would rather opt for a convergence criterion that only uses absolute norms of residuals in comparison to tol.

Jutho avatar Apr 18 '25 10:04 Jutho

  1. One of my comments regarding the residual computation was that I think for kappa_j you need not simply M[converged+1,converged+1] but rather Z[:,converged+1]'*M*Q[:,converged+1].

Yes, that is correct. I'm however fine with dropping this prefactor alltogether as you suggested in your other comment. :)

  1. Can you elucidate what the problem is with the eigenvalues of H and K not being matched up perfectly? Is this only about complex conjugate eigenvalue pairs in the case of using real arithmetic? Or is it more general? You know only fix this in the final step, where you compute the eigenvectors, but I would like for this to be true during the Schur process. I am not sure if it is strictly necessary, but I could foresee problems, where e.g. in deciding on the keep variable, it might be that both H and K have 2 x 2 blocks that misalign, so that no keep can be chosen that doesn't cut through blocks in either of those two matrices.

I've only encountered this problem when working with (mathematically) real matrices, i.e., independent of the arithmetic type (complex or real), but those who have the property that both $\lambda$ and $\lambda^\star$ are eigenvalues. Then it may happen that when sorting for smallest real the eigenvalues of $H$ and $K$ are $[\ldots, \lambda, \lambda^\star, \ldots]$ and $[\ldots, \lambda^\star, \lambda, \ldots]$ respectively. To prevent this, I implented this sort-and-match algorithm. For the keep part, I thought that if we put the keep threshold in between the two, i.e., keep $\lambda$ in $H$ and $\lambda^\star$ in $K$, the two of them would technically be the same as unconverged eigenvectors.

Thats as much as I know. Sadly, I did not find any information on this in the paper. When I tried enforcing this explicit ordering in the Krylov decomposition by adjusting pK = sortperm(valuesK; by=by ∘ conj, rev=rev) to also match to pH, I encountered the cannot split 2x2 blocks when permuting schur decomposition when calling permuteschur! and therefore dropped this approach.

tipfom avatar Apr 22 '25 08:04 tipfom

Ok thanks for the comments; that is very helpful. When working with real arithmetic, a valid which argument should anyway be such that it doesn't distinguish between $$\lambda$$ and $$\lambda^ast$$ (even though that is hard to check in full generality if it is provided as a general EigSorter.by. The complex conjugate pairs of eigenvalues are always returned by Lapack in fixed order (first the one with negative real part, then the one with positive real part). But indeed, since the eigenvalues of H and K are related by complex conjugation, it does mean that complex pairs of eigenvalues are actually swapped between H and K. But then it is in a fully deterministic manner, and we don't need complicated matching logic I believe.

Jutho avatar Apr 22 '25 12:04 Jutho

Yes. Exactly due to this fact, the matching logic currently boils down to a linear loop which also normalizes the vectors. I'm also very open to simplifying it more :)

tipfom avatar Apr 22 '25 12:04 tipfom

I've now gotten around to using this method some more and implemented non-Hermitian DMRG using it. In this context, I noticed that the concerns raised by @lkdvos about the precision used in isapprox is very important, i.e., for early sweeps the left/right eigenvalues may not match perfectly as is the case with the more constructed examples in the test suite. Currently, the method should at least (i) allow the required precision as a parameter and (ii) not return an error if the algorithm does not converge. Have you @Jutho implemented a different way to return the "correctly" matched eigenvalue pairs? Otherwise, I will think some more about it and commit a suggested fix.

tipfom avatar Apr 30 '25 13:04 tipfom

I did some more digging and in SLEPc, they've also implemented the algorithm. However, they also need to run a match loop for the eigenvalues to be sorted properly when using the same LAPACK functionality, cf. here.

tipfom avatar May 02 '25 08:05 tipfom

My apologies for the silence, I was unavailable all of last week. I will try to pick it up again this week and finally push this over the finish line.

Jutho avatar May 05 '25 09:05 Jutho

Is there something else I can contribute to help get this finished @Jutho ? 😊 I tried to come up with a more elegant solution to the eigenvalue sorting; however, none of them really yielded satisfying results. There are some improvements to the current implementation, i.e., changing the comparison for Complex values to only compare the norm difference such that 1+1e-19im and 1+1e-16im would still compare be the same which they currently do not because isapprox is applied to both the real and the imaginary part. This is however only a minor improvement to the current implementation and does not fundamentally fix the issue.

tipfom avatar May 22 '25 13:05 tipfom

My apologies for the silence here; I got a bit distracted by other code reviews. This week I am also not actively at work very much, but I will still try to find some time. If not, then definitely early next week I will make time for finishing this.

Jutho avatar May 27 '25 11:05 Jutho

I've now updated the residual norm convention to be with rV and rW.

tipfom avatar Jun 10 '25 21:06 tipfom

Ok I finally got around spending some more time on this PR, which was really needed for me to make progress. I have changed quite a few things, but in particular the way the final eigenvectors are computed. I think that in the way I compute them, the ordering and biorthogonality of the left and right eigenvectors is automatically guaranteed. Of course, it depends on inverting (part of) the overlap matrix M.

I also changed the structure of the return values of both bieigsolve and _bischursolve. As I have not yet updated the tests, this will definitely lead to failures. I would first like to hear your thoughts on the current implementation before I also update the tests. However, I think the hard work is now done and this PR should be able to land soon.

Jutho avatar Jun 18 '25 23:06 Jutho

Hi @Jutho, I like this new approach with inverting M much better! This now also allows the algorithm to be used to get the current best estimate of both left and right eigenvector without full convergence, which is really neat. Thanks! I've pushed the changes which I needed to get it to run on my local machine.

The only major change is that at least for me working with conj(Q) and conj(Z) and h' and `k' yielded wrong results. Can you maybe confirm that this happens for you as well, e.g., by running

  using LinearAlgebra
  using KrylovKit
  using Random
  
  function main()
      Random.seed!(111111)
  
      A = rand(ComplexF64, 1000, 1000)
      x0 = rand(eltype(A), 1000)
  
      fAr = x -> A * x 
      fAl = x -> Adjoint(A) * x
  
      vals, (V, W), info = bieigsolve((fAr, fAl), x0, x0, 3, :SR, BiArnoldi(krylovdim=10, maxiter=1000, tol=1e-10))
  
      @show eigvals(A)[1:5]
  
      for i in eachindex(vals)
          @show vals[i]
          @show KrylovKit.inner(W[i], fAr(V[i]))
      end
  end
  
  main()

tipfom avatar Jun 19 '25 08:06 tipfom

@tipfom, the tests now run locally. I think the actual implementation is now complete, but the tests currently only cover the case with a single iteration of a small problem. Would you be willing to further complete the tests by adding a larger test case that is handled iteratively with more iterations, analogous to the other Arnoldi tests? I think that is the final thing that needs to happen in order to merge this PR.

Jutho avatar Jun 22 '25 08:06 Jutho

@Jutho, I've now added the tests. Sorry for the long delay :)

tipfom avatar Jul 06 '25 15:07 tipfom

Thanks, I'll take a look asap, such that this can be merged and a new KrylovKit version can be tagged. Was there a specific reason to remove the eager option?

Jutho avatar Jul 06 '25 22:07 Jutho

Yes, I removed it because as of now the option was not working as intended, i.e., the unit tests failed with eager=true. However, I did not investigate this further and don't see an immediate reason why this would necessarily be the case. Is this something we aspire to include? If so, I could spend some time tomorrow to figure out if it is an algorithmic constraint or just some operation I missed to apply when using the eager option.

tipfom avatar Jul 07 '25 11:07 tipfom

@tipfom , I finally made some time to look at this again, and would like to get this merged before tagging a new release of KrylovKit. However, currently, I get completely wrong results as soon as maxiter > 1, i.e. as soon as a restart takes place. Was this working before for you and did I manage to break something?

I did rebase this PR on the current master. Apparently I also had some uncommited changes about the role of the h and k vectors, so maybe this is what messed something up?

Jutho avatar Jul 31 '25 23:07 Jutho

It seems to be the eager option; with eager=false everything works like it should. That's probably why you removed it earlier? However, it should be possible to make this work; just a matter of finding the bug.

Jutho avatar Aug 01 '25 00:08 Jutho