Arpack.jl icon indicating copy to clipboard operation
Arpack.jl copied to clipboard

Wrong eigenvalues on first run

Open david-pl opened this issue 4 years ago • 12 comments

Hi!

When updating the version of Arpack in a package I noticed tests started to fail some of the time. The following code reproduces the issue. The test at the end fails on Arpack v0.5, but only on the first run. On later runs, it works. I apologize for the somewhat convoluted construction of matrices here, but whenever I tried to simplify the error no longer occurred.

using Arpack
using LinearAlgebra, SparseArrays, Random, Test

Random.seed!(0)

ωc = 1.2
ωa = 0.9
γ = 0.5
κ = 1.1

sz = sparse(ComplexF64[1 0; 0 -1])
sp = sparse(ComplexF64[0 1; 0 0])
sm = sparse(collect(sp'))
ids = one(sz)

a = sparse(diagm(1 => ComplexF64[sqrt(i) for i=1:10]))
ida = one(a)

Ha = kron(ida, 0.5*ωa*sz)
Hc = kron(ωc*a'*a, ids)
Hint = sparse(kron(a', sm) + kron(a, sp))
H = Ha + Hc + Hint

Ja = kron(ida, sqrt(γ)*sm)
Jc = kron(sqrt(κ)*a, ids)
J = sqrt(2) .* [Ja, Jc]
Jdagger = adjoint.(J);
rates = 0.5 .* ones(length(J))

spre(x) = kron(one(x), x)
spost(x) = kron(permutedims(x), one(x))

L = spre(-1im*H) + spost(1im*H)
for i=1:length(J)
    jdagger_j = rates[i]/2*Jdagger[i]*J[i]
    global L -= spre(jdagger_j) + spost(jdagger_j)
    global L += spre(rates[i]*J[i]) * spost(Jdagger[i])
end
d, rest = eigs(L, nev=2, which=:LR)

@test abs(d[1]) < 1e-9

This worked fine on v0.4. Note that when using eigen(Matrix(L)) the eigenvalue with the largest real part is 0.

Version info:

Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

david-pl avatar Dec 29 '20 17:12 david-pl

I can reproduce this issue and it goes away if you downgrade to version 3.5 of Arpack_jll, i.e. to version 3.5. of arpack-ng. I noticed something similar some time ago when I tried to upgrade to version 3.7, see https://github.com/JuliaLinearAlgebra/Arpack.jl/issues/76#issuecomment-560423467. It seems like the arpack-ng developers are making changes to the actual algorithm and not just the building/packaging.

andreasnoack avatar Dec 30 '20 21:12 andreasnoack

I will tag a new release that reverts the Arpack_jll update. We should also add this example to our tests and file the issue upstream.

ViralBShah avatar Dec 31 '20 16:12 ViralBShah

They seem to have made small tiny bugfixes in the last couple of years, and I wonder if those affect us. It would be nice if someone can investigate.

ViralBShah avatar Dec 31 '20 16:12 ViralBShah

In the above test, if I ask for 3 eigenvalues or more, then the test passes.

julia> d, rest = eigs(L, nev=3, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Passed

ViralBShah avatar Dec 31 '20 19:12 ViralBShah

Also, as reported, running it repeatedly makes it pass sometimes.

julia> @test abs(d[1]) < 1e-9
Test Failed at REPL[60]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.0933319522234535 < 1.0e-9
ERROR: There was an error during testing

julia> d, rest = eigs(L, nev=2, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Failed at REPL[62]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.093331952223471 < 1.0e-9
ERROR: There was an error during testing

julia> d, rest = eigs(L, nev=2, which=:LR);

julia> @test abs(d[1]) < 1e-9
Test Passed

While running it repeatedly, I also found this error in one of the runs:

julia> d, rest = eigs(L, nev=2, which=:LR); @test abs(d[1]) < 1e-9
ERROR: 
┌ Error: XYAUPD_Exception: Please check XYAUPD error codes in the ARPACK manual.
│   info = 1
└ @ Arpack ~/.julia/dev/Arpack/src/libarpack.jl:24

Stacktrace:
 [1] aupd_wrapper(::Type{T} where T, ::Arpack.var"#18#28"{SparseMatrixCSC{Complex{Float64},Int64}}, ::Arpack.var"#19#29", ::Arpack.var"#20#30", ::Int64, ::Bool, ::Bool, ::String, ::Int64, ::Int64, ::String, ::Float64, ::Int64, ::Int64, ::Array{Complex{Float64},1}) at /Users/viral/.julia/dev/Arpack/src/libarpack.jl:92
 [2] _eigs(::SparseMatrixCSC{Complex{Float64},Int64}, ::UniformScaling{Bool}; nev::Int64, ncv::Int64, which::Symbol, tol::Float64, maxiter::Int64, sigma::Nothing, v0::Array{Complex{Float64},1}, ritzvec::Bool, explicittransform::Symbol) at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:235
 [3] #eigs#10 at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:47 [inlined]
 [4] #eigs#9 at /Users/viral/.julia/dev/Arpack/src/Arpack.jl:46 [inlined]
 [5] top-level scope at REPL[65]:1

ViralBShah avatar Dec 31 '20 19:12 ViralBShah

I bisected and found that the failure is introduced between https://github.com/opencollab/arpack-ng/commit/0e7d01d34b48fa092a58efbffed41267c03aa2af (last known good commit) and https://github.com/opencollab/arpack-ng/commit/7fc42e5f775fe1e5c5d444b666b01d54117f1c3f.

Unfortunately, many of the shared library builds fail in that range making it difficult to pinpoint, and some of the successful builds complain about not enough space to build the factorization (which feels like a bug since it is working fine in earlier versions).

ViralBShah avatar Nov 17 '21 17:11 ViralBShah

If I have done my analysis right - the failure is introduced in https://github.com/opencollab/arpack-ng/compare/0e7d01d..7fc42e5

ViralBShah avatar Nov 17 '21 18:11 ViralBShah

@andreasnoack This test fails on ARPACK 3.5 that we ship as well, and I can reproduce it with Arpack.jl 0.4 as well. The error itself appears to be due to different sorting of eigenvalues.

julia> for i=1:10000;  d, rest = eigs(L, nev=2, which=:LR); @test abs(d[1]) < 1e-9; end

Test Failed at REPL[35]:1
  Expression: abs(d[1]) < 1.0e-9
   Evaluated: 2.0933319522234384 < 1.0e-9
ERROR: There was an error during testing

Thus I am not sure if my earlier analyses of where the bug was introduced is correct.

ViralBShah avatar Oct 21 '22 03:10 ViralBShah

If I use a rand initialized v0, I can't trigger the failure with ARPACK 3.5.

ViralBShah avatar Oct 21 '22 13:10 ViralBShah

What changed relative to https://github.com/JuliaLinearAlgebra/Arpack.jl/issues/118#issuecomment-971835985? Is it something about how the library is built?

andreasnoack avatar Oct 21 '22 13:10 andreasnoack

More findings for ARPACK 3.5 vs. 3.8 https://github.com/JuliaLinearAlgebra/Arpack.jl/issues/138#issuecomment-1287827378. 3.8 fails reliably even with an initialization vector - fails always.

ViralBShah avatar Oct 22 '22 15:10 ViralBShah

Here's the matrix from #118 as a gist:

https://gist.github.com/ViralBShah/ae892fe6df0884185313f3f0e10c03cc

ViralBShah avatar Feb 28 '23 01:02 ViralBShah