
LAPACKException(2) keeps happening

Open jwmi opened this issue 5 years ago • 6 comments

Hi, I've been using TSVD extensively -- thanks for this great package. However, I keep getting the following error when calling tsvd. Below is a minimal example in which failure occurs around 63% of the time. The error occurs on both Mac and Windows. Is it possible to fix this issue? Thanks! Jeff (@jwmi)

```julia
julia> using TSVD

julia> tsvd([1 0 ; 0 1], 1)
ERROR: LAPACKException(2)
Stacktrace:
 [1] chklapackerror at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:38 [inlined]
 [2] bdsdc!(::Char, ::Char, ::Array{Float64,1}, ::Array{Float64,1}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:5437
 [3] #svd!#162 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203 [inlined]
 [4] svd!(::Bidiagonal{Float64,Array{Float64,1}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203
 [5] svd(::Bidiagonal{Float64,Array{Float64,1}}; kw::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207
 [6] svd at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207 [inlined]
 [7] biLanczos(::Array{Int64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:171
 [8] _tsvd(::Array{Int64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:232
 [9] #tsvd#7 at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:359 [inlined]
 [10] tsvd(::Array{Int64,2}, ::Int64) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:359
 [11] top-level scope at REPL[12]:1
```

jwmi avatar Nov 05 '20 04:11 jwmi

So https://github.com/JuliaLinearAlgebra/TSVD.jl/pull/26 fixes your minimal reproducer but I'd be interested in knowing if it also fixes your real use cases so please comment if the issue remains after the patch release.

andreasnoack avatar Nov 05 '20 08:11 andreasnoack

Hi @andreasnoack Thanks for working on this! Even with this patch release, I'm still running into the same error in general use cases. Here is a simple simulation in which the error occurs on rank-one matrices.

```julia
using TSVD
I,J,K,k,n = 10,2,1,1,10000
for i = 1:n; tsvd(randn(I,K)*randn(J,K)',k); end
```

```
ERROR: LinearAlgebra.LAPACKException(2)
Stacktrace:
 [1] chklapackerror at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:38 [inlined]
 [2] bdsdc!(::Char, ::Char, ::Array{Float64,1}, ::Array{Float64,1}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:5437
 [3] #svd!#162 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203 [inlined]
 [4] svd!(::LinearAlgebra.Bidiagonal{Float64,Array{Float64,1}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203
 [5] svd(::LinearAlgebra.Bidiagonal{Float64,Array{Float64,1}}; kw::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207
 [6] svd at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207 [inlined]
 [7] biLanczos(::Array{Float64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:175
 [8] _tsvd(::Array{Float64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:236
 [9] #tsvd#7 at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:363 [inlined]
 [10] tsvd(::Array{Float64,2}, ::Int64) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:363
 [11] top-level scope at .\REPL[3]:1
```

The frequency with which an error occurs depends on the dimensions and rank of the matrix. Playing around a bit with the following chunk of code, it seems to occur more frequently on matrices in which one of the two dimensions is small and the rank is low.

```julia
I,J,K,k,n = 10,2,1,1,10000
f(A,k) = (try; tsvd(A,k); 0; catch; 1; end)
error_rate = sum([f(randn(I,K)*randn(J,K)',k) for i=1:n])/n
println("Error rate = ", error_rate)
```

```
Error rate = 0.0636
```
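Until the underlying breakdown is fixed, one pragmatic workaround (a hypothetical sketch, not part of TSVD.jl) is to catch the exception and fall back to a dense SVD, which is cheap for matrices this small. The wrapper below takes the truncated solver as an argument, so `with_dense_fallback(tsvd, A, k)` would be the intended call; the name `with_dense_fallback` is my own invention.

```julia
using LinearAlgebra

# Hypothetical wrapper (not TSVD.jl API): call a truncated-SVD routine and,
# if it throws a LAPACK error, fall back to a dense SVD of the full matrix.
# Returns (U, s, V) in either case.
function with_dense_fallback(trunc_svd, A, k)
    try
        return trunc_svd(A, k)
    catch e
        e isa LinearAlgebra.LAPACKException || rethrow()
        F = svd(Matrix(A))                  # dense fallback; fine for small A
        return F.U[:, 1:k], F.S[1:k], F.V[:, 1:k]
    end
end
```

This keeps long simulation loops running while the error rate is nonzero, at the cost of an occasional dense factorization.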

jwmi avatar Nov 05 '20 20:11 jwmi

Thanks for the new example. I think a proper fix here will require some real effort. One of the new vectors is proportional to the previous basis vector, so it ends up as the zero vector after the axpy. I couldn't find a description of the issue in @rmlarsen's thesis, but I might have missed it. Unfortunately, I don't have time to dive into the details right now, but I plan to take a closer look as soon as I have some spare cycles.
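The breakdown is easy to see directly with two Golub-Kahan bidiagonalization steps on a rank-1 matrix. This is a standalone illustration using only `LinearAlgebra`, not TSVD.jl's internal code:

```julia
using LinearAlgebra

# Two Golub-Kahan steps on a rank-1 matrix (standalone sketch).
x, y = collect(1.0:10.0), [1.0, 2.0]
A  = x * y'                        # rank-1, 10x2
v1 = normalize([1.0, 1.0])
p  = A * v1
α1 = norm(p); u1 = p / α1          # u1 is parallel to x
r  = A' * u1 - α1 * v1             # the "axpy" on the right-hand side
β1 = norm(r); v2 = r / β1
p2 = A * v2 - β1 * u1              # A*v2 is parallel to u1, so p2 ≈ 0
α2 = norm(p2)                      # ~eps: the new basis vector has collapsed
```

In exact arithmetic `α2` is exactly zero (the Krylov space is exhausted), but in floating point it is a tiny noise vector; normalizing it yields an essentially random direction that later trips LAPACK.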

andreasnoack avatar Nov 07 '20 20:11 andreasnoack

This is an absolutely essential issue to deal with when implementing the algorithm in finite precision arithmetic. If you study the PROPACK implementation, you'll see the code related to this. See, for example, Jack Poulson's copy of PROPACK here:

https://github.com/poulson/PROPACK/blob/master/double/dlanbpro.F#L386

(Easier than referring to my old tarball.)

Rasmus


rmlarsen avatar Nov 08 '20 20:11 rmlarsen

BTW: There are lots of little details related to estimating norms for the stopping criterion, refining error bounds etc. that are not discussed in their final form in the thesis.


rmlarsen avatar Nov 08 '20 20:11 rmlarsen

FYI, I'm still getting this error. Has anyone tried adapting the lines of code from the PROPACK reference above to the TSVD.jl implementation?

EDIT: Actually, I found that in my case this happened in two situations:

  1. If I passed an (N×1) vector into tsvd, which could arguably be considered user error. (This sometimes happened automatically inside another algorithm.)
  2. Somewhat randomly, when no orthogonal vector is found. Inspired by the link to PROPACK above, I inserted a reorthogonalization step after v and alpha are computed (svd.jl:54):
```julia
if α < τ * maximum(size(A)) * eps()  # orthogonalization failed, see https://github.com/poulson/PROPACK/blob/2465f89d5b1fba56de71c3e69e27d017c3dc2295/double/dlanbpro.F#L384
    @warn "Restart orthogonalization"
    b = V[1:(j-1)]
    B = KrylovKit.OrthonormalBasis(b ./ norm.(b))
    for _ in 1:3
        v = rand!(v) * 2 .- 1
        KrylovKit.orthonormalize!(v, B, KrylovKit.ModifiedGramSchmidt())
        α = norm(v)
        if !(α < τ * maximum(size(A)) * eps())
            break
        end
    end
end
```

This seems to fix my issue for now.
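For reference, the same restart idea can be sketched without KrylovKit, using only `LinearAlgebra`. Here `restart_vector` is a hypothetical helper (not TSVD.jl or PROPACK API): draw a random vector, orthogonalize it against the current basis with a modified Gram-Schmidt sweep, and retry if the remainder is negligible.

```julia
using LinearAlgebra

# Hypothetical helper (not part of TSVD.jl): produce a unit vector orthogonal
# to the columns of V, retrying with fresh random draws if the orthogonalized
# remainder is numerically zero.
function restart_vector(V::AbstractMatrix{T}; attempts::Int=3) where {T<:Real}
    n = size(V, 1)
    for _ in 1:attempts
        v = 2 .* rand(T, n) .- 1            # random vector in [-1, 1]^n
        for q in eachcol(V)                 # modified Gram-Schmidt sweep
            v .-= (q' * v) .* q
        end
        α = norm(v)
        α > n * eps(T) && return v ./ α     # accept if clearly nonzero
    end
    error("failed to find a vector orthogonal to the basis")
end
```

This mirrors what the PROPACK code linked above does at a breakdown, minus PROPACK's bookkeeping of the recurrence coefficients.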

RomeoV avatar Oct 20 '23 21:10 RomeoV