qiskit icon indicating copy to clipboard operation
qiskit copied to clipboard

Use of general eigenvalues solver in weyl_coordinates sometimes throws convergence error

Open levbishop opened this issue 2 years ago • 4 comments

It looks like the optimizations in #6896 can sometimes cause problems with convergence. @albertzhu01 found several cases that triggered this problem, although there seems to be some environment-dependence that makes it hard to reproduce across machines - the resulting almost-identity-matrix up to 1.E-16 scale deviations can cause error in lapack geev.

Any thoughts @jakelishman ?

---- from qiskit-experiments issue https://github.com/Qiskit/qiskit-experiments/issues/846#issuecomment-1205772386_

The following test case reliably reproduces the error:

import numpy as np
from scipy import linalg as la
mat = np.array(
    [
        [0.9999999999999998j, 0j, -3.469446951953614e-18j, 0j], 
        [0j, 0.9999999999999998j, (8.673617379884035e-19+2.6020852139652106e-18j), -6.938893903907228e-18j], 
        [-3.469446951953614e-18j, (8.673617379884035e-19+2.6020852139652106e-18j), 0.9999999999999998j, 0j], 
        [0j, -6.938893903907228e-18j, 0j, 0.9999999999999998j]
    ]
)
la.eigvals(mat)

Output:

LinAlgError                               Traceback (most recent call last)
Input In [17], in <cell line: 9>()
      1 from scipy import linalg as la
      2 mat = np.array(
      3     [
      4         [0.9999999999999998j, 0j, -3.469446951953614e-18j, 0j], 
   (...)
      7         [0j, -6.938893903907228e-18j, 0j, 0.9999999999999998j]]
      8 )
----> 9 la.eigvals(mat)

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:880, in eigvals(a, b, overwrite_a, check_finite, homogeneous_eigvals)
    811 def eigvals(a, b=None, overwrite_a=False, check_finite=True,
    812             homogeneous_eigvals=False):
    813     """
    814     Compute eigenvalues from an ordinary or generalized eigenvalue problem.
    815 
   (...)
    878 
    879     """
--> 880     return eig(a, b=b, left=0, right=0, overwrite_a=overwrite_a,
    881                check_finite=check_finite,
    882                homogeneous_eigvals=homogeneous_eigvals)

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:247, in eig(a, b, left, right, overwrite_a, overwrite_b, check_finite, homogeneous_eigvals)
    244     w = wr + _I * wi
    245     w = _make_eigvals(w, None, homogeneous_eigvals)
--> 247 _check_info(info, 'eig algorithm (geev)',
    248             positive='did not converge (only eigenvalues '
    249                      'with order >= %d have converged)')
    251 only_real = numpy.all(w.imag == 0.0)
    252 if not (geev.typecode in 'cz' or only_real):

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:1356, in _check_info(info, driver, positive)
   1353     raise ValueError('illegal value in argument %d of internal %s'
   1354                      % (-info, driver))
   1355 if info > 0 and positive:
-> 1356     raise LinAlgError(("%s " + positive) % (driver, info,))

LinAlgError: eig algorithm (geev) did not converge (only eigenvalues with order >= 2 have converged)

Originally posted by @albertzhu01 in https://github.com/Qiskit/qiskit-experiments/issues/846#issuecomment-1205772386

levbishop avatar Aug 04 '22 21:08 levbishop

A crude hack is: trap the exception and check if the code returned (info) satisfies info>0. If so, add normally distributed random numbers with standard deviation 1e-20 to the matrix and try to compute the eigenvalues again. That hack works in this case.

It could be that it's not just that the matrix is almost the identity. The degeneracy in the eigenvalues, or the diagonal, is necessary to cause the error... in this case at least. Changing one of the diagonal entries so that it ends in 9 rather than 8 is enough to see success.

In this case, multiplying the matrix by $\sqrt{-1}$ also avoids the error.

jlapeyre avatar Aug 05 '22 05:08 jlapeyre

A crude hack is: trap the exception and check if the code returned (info) satisfies info>0. If so, add normally distributed random numbers with standard deviation 1e-20 to the matrix and try to compute the eigenvalues again. That hack works in this case.

The changes in #6896 were intended to avoid the randomized algorithm in TwoQubitWeylDecomposition, which after a lot of battle-hardening seems finally 🤞 immune to such pathologies. I guess rather than trapping the error and ad-hoc randomizing, I'd rather trap the error and fall back to the TwoQubitWeylDecomposition alogorithm. Although having two paths like this worries me, and if the general eigensolver is unreliable maybe its better to simply use the TwoQubitWeylDecomposition algorithm everywhere unless the performance hit is substantial.

levbishop avatar Aug 05 '22 15:08 levbishop

I think the use of randomization is quite different. And pretty small in terms of code and complexity. But, I understand that introducing another piece of randomness might result in a series of tweaks as failing cases trickle in under general use.

If there are understandable characteristics of offending matrices, we could make some kind of test suite in order to be more confident (if not perfectly) that a solution will work. I suppose a small subset of these would go in the big test suite.

EDIT: There already is something like this

The terra testsuite exercises a bunch of the pathological cases. It would be interesting to see if the testsuite passes on your setup. Something like python -m unittest test.python.quantum_info.test_synthesis -cb should run the relevant pieces

jlapeyre avatar Aug 05 '22 20:08 jlapeyre

Apropos environment-specific features: The example above succeeds when using MKL (on my machine). I'm not 100% sure because I used Julia (I didnt look up yet how to use mkl with scipy.linalg). But, it looks like python and julia are calling the same lapack routines. Furthermore, with openblas, Julia fails with the same error as numpy.

jlapeyre avatar Aug 05 '22 22:08 jlapeyre

It's definitely a function of the environment beyond just the specific blas implementation. On my linux system I do not encounter any failures with openblas, but on my m1 mac also using openblas it does reliably fail.

mtreinish avatar Oct 03 '23 19:10 mtreinish