TSVD.jl
LAPACKException(2) keeps happening
Hi, I've been using TSVD extensively -- thanks for this great package. However, I keep getting the following error when calling tsvd. Below is a minimal example in which failure occurs around 63% of the time. The error occurs on both Mac and Windows. Is it possible to fix this issue? Thanks! Jeff (@jwmi)
> using TSVD
> tsvd([1 0 ; 0 1], 1)
ERROR: LAPACKException(2)
Stacktrace:
[1] chklapackerror at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:38 [inlined]
[2] bdsdc!(::Char, ::Char, ::Array{Float64,1}, ::Array{Float64,1}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:5437
[3] #svd!#162 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203 [inlined]
[4] svd!(::Bidiagonal{Float64,Array{Float64,1}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203
[5] svd(::Bidiagonal{Float64,Array{Float64,1}}; kw::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207
[6] svd at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207 [inlined]
[7] biLanczos(::Array{Int64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:171
[8] _tsvd(::Array{Int64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:232
[9] #tsvd#7 at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:359 [inlined]
[10] tsvd(::Array{Int64,2}, ::Int64) at C:\Users\jwmil\.julia\packages\TSVD\Kpryb\src\svd.jl:359
[11] top-level scope at REPL[12]:1
So https://github.com/JuliaLinearAlgebra/TSVD.jl/pull/26 fixes your minimal reproducer, but I'd be interested in knowing whether it also fixes your real use cases, so please comment if the issue remains after the patch release.
Hi @andreasnoack Thanks for working on this! Even with this patch release, I'm still running into the same error in general use cases. Here is a simple simulation in which the error occurs on rank-one matrices.
using TSVD
I,J,K,k,n = 10,2,1,1,10000
for i = 1:n; tsvd(randn(I,K)*randn(J,K)',k); end
ERROR: LinearAlgebra.LAPACKException(2)
Stacktrace:
[1] chklapackerror at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:38 [inlined]
[2] bdsdc!(::Char, ::Char, ::Array{Float64,1}, ::Array{Float64,1}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\lapack.jl:5437
[3] #svd!#162 at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203 [inlined]
[4] svd!(::LinearAlgebra.Bidiagonal{Float64,Array{Float64,1}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:203
[5] svd(::LinearAlgebra.Bidiagonal{Float64,Array{Float64,1}}; kw::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207
[6] svd at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\LinearAlgebra\src\bidiag.jl:207 [inlined]
[7] biLanczos(::Array{Float64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:175
[8] _tsvd(::Array{Float64,2}, ::Int64; maxiter::Int64, initvec::Array{Float64,1}, tolconv::Float64, tolreorth::Float64, stepsize::Int64, debug::Bool) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:236
[9] #tsvd#7 at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:363 [inlined]
[10] tsvd(::Array{Float64,2}, ::Int64) at C:\Users\jwmil\.julia\packages\TSVD\U0N8S\src\svd.jl:363
[11] top-level scope at .\REPL[3]:1
The frequency with which the error occurs depends on the dimensions and rank of the matrix. Playing around a bit with the following chunk of code, I found that it seems to occur more frequently on matrices where one of the two dimensions is small and the rank is low.
I,J,K,k,n = 10,2,1,1,10000
f(A,k) = (try; tsvd(A,k); 0; catch; 1; end)
error_rate = sum([f(randn(I,K)*randn(J,K)',k) for i=1:n])/n
println("Error rate = ",error_rate)
Error rate = 0.0636
Thanks for the new example. I think a proper fix here will require some real effort. One of the new vectors is proportional to the previous basis vector, so it ends up as the zero vector after the axpy. I couldn't find a description of the issue in @rmlarsen's thesis, but I might have missed it. Unfortunately, I don't have time to dive into the details right now, but I plan to take a closer look as soon as I have some spare cycles.
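To make the breakdown concrete, here is a minimal editorial sketch (not the package's code) of two Golub–Kahan bidiagonalization steps on a rank-one matrix like the one in the reproducer. In exact arithmetic A'u₂ lies exactly in the span of v₁, so the subtraction (the axpy) leaves only rounding noise:

```julia
# Illustrative sketch of the breakdown described above, not TSVD.jl internals.
using LinearAlgebra, Random

Random.seed!(0)
A = randn(10, 1) * randn(2, 1)'             # rank-one 10×2 matrix, as in the example
u = normalize(randn(10))                    # random starting left vector

v = A'u;           α = norm(v); v ./= α     # first right vector
r = A*v .- α .* u; β = norm(r); u = r ./ β  # first left vector: still fine
w = A'u .- β .* v                           # second right vector, before normalization

# Because rank(A) == 1, A'u is parallel to v, so w is pure rounding noise.
# Normalizing it anyway and feeding the resulting bidiagonal matrix to
# LAPACK is what ends in LAPACKException(2).
println(norm(w))                            # on the order of machine epsilon
```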
This is an absolutely essential issue to deal with when implementing the algorithm in finite-precision arithmetic. If you study the PROPACK implementation, you'll see the code related to this. See, for example, Jack Poulson's copy of PROPACK here:
https://github.com/poulson/PROPACK/blob/master/double/dlanbpro.F#L386
(Easier than referring to my old tarball.)
Rasmus
BTW: There are lots of little details related to estimating norms for the stopping criterion, refining error bounds etc. that are not discussed in their final form in the thesis.
I'm still getting this error, FYI. Has anyone tried adapting the lines of code from the PROPACK reference above to the KSVD.jl implementation?
EDIT: Actually, I found that in my case this happened in two situations:
- If I put an (Nx1) vector into tsvd, which could maybe be considered user error. (This happened sometimes automatically during another algorithm.)
- Somewhat randomly, when no orthogonal vector is found.
Inspired by the link to PROPACK above, I inserted a reorthogonalization step after v and alpha are computed (svd.jl:54):
if α < τ * maximum(size(A)) * eps()  # orthogonalization failed, see https://github.com/poulson/PROPACK/blob/2465f89d5b1fba56de71c3e69e27d017c3dc2295/double/dlanbpro.F#L384
    @warn "Restart orthogonalization"
    # requires `using KrylovKit, Random`; τ is the tolerance already in scope in tsvd
    b = V[1:(j - 1)]
    B = KrylovKit.OrthonormalBasis(b ./ norm.(b))
    for _ in 1:3
        rand!(v)
        v .= 2 .* v .- 1  # fresh random direction in [-1, 1)
        KrylovKit.orthonormalize!(v, B, KrylovKit.ModifiedGramSchmidt())
        α = norm(v)
        if !(α < τ * maximum(size(A)) * eps())
            break  # found a direction that survives orthogonalization
        end
    end
end
This seems to fix my issue for now.
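For reference, the same PROPACK-style restart can be sketched without a KrylovKit dependency, using a plain modified Gram–Schmidt sweep. The function name and the tolerance choice here are illustrative, not TSVD.jl internals:

```julia
# Standalone sketch of a random-restart reorthogonalization step:
# draw a random vector, orthogonalize it against the current basis V,
# and accept it once its remaining norm is clearly nonzero.
using LinearAlgebra, Random

function restart_vector!(v::AbstractVector, V::AbstractVector{<:AbstractVector};
                         tol = length(v) * eps(), attempts = 3)
    for _ in 1:attempts
        rand!(v)
        v .= 2 .* v .- 1                       # fresh random direction in [-1, 1)
        for q in V                             # modified Gram-Schmidt sweep
            v .-= (dot(q, v) / dot(q, q)) .* q
        end
        α = norm(v)
        α > tol && return v ./= α              # accept: normalize in place
    end
    error("could not find a vector orthogonal to the current basis")
end
```

Calling restart_vector!(v, V) on a basis V of orthonormal vectors overwrites v with a unit vector orthogonal to all of them, which is the invariant the Lanczos recursion needs to continue after a breakdown.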