`svdsolve` rrule error when not all values are converged
In one of the PEPSKit testruns, we are running into an issue where the within the rrule for svdsolve, the eigensolver is failing to converge all requested values. In this case, the output vals and vecs are not of the expected length, leading to an out of bounds error.
The stacktrace is found here
The offending line is this line .
There is already a warning being thrown, but I guess we have to limit n to the size of the computed vals and vecs.
I am confused, vals and vecs should always have length at least the requested length n, unless an invariant subspace is encountered, in which case KrylovKit has no way to further grow the Krylov subspace.
Yeah, that's exactly what it's hitting. The log is a bit cryptic (blame Zygote), but this is relevant:
[ Info: CTMRG conv 4: obj = +9.192864957180e-01 err = 4.1934293618e-11 time = 0.75 sec
┌ Warning: Invariant subspace of dimension 7 (up to requested tolerance `tol = 1.0e-8`), which is smaller than the number of requested eigenvalues (i.e. `howmany == 14`); setting `howmany = 7`.
└ @ KrylovKit ~/.julia/packages/KrylovKit/xccMN/src/eigsolve/arnoldi.jl:350
┌ Warning: `svdsolve` cotangent linear problem (7) returns unexpected result
└ @ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/xccMN/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:240
┌ Warning: `svdsolve` cotangent linear problem (14) returns unexpected result
└ @ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/xccMN/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:240
Ok that is very weird. The different eigenvectors of the linear operator are supposed to be of the form
$$((\lambda_i - A)^{-1}\Delta_i , e_i)$$
with $e_i$ the length $n$ unit vector in the $i$ direction, and this for $i=1,...,n$. The initial vector is given by $(0, e_1 + e_2 + ... + e_n)$. That should have support in each of the eigenspaces.
I will try to reproduce this locally and investigate.
Apologies that I dont have a MWE, I am not sure how to construct something for which the forward algorithm passes but the reverse fails. In any case, my guess wasn't that there necessarily is something wrong with KrylovKit, I think our tolerances were just not configured properly, but I would expect KrylovKit to throw a warning and move on, rather than throwing a boundserror
These are the singular values of the forward calculation:
[0.5827127960663261, 0.013616701381118779, 0.013616700815086135, 0.00031674568674968046, 7.429999268923436e-6, 7.4299980267290364e-6, 2.0722724010027094e-7, 2.0722722219200433e-7, 2.0722720896899213e-7, 2.0540682979946611e-7, 5.767011119369518e-9, 5.767010235714964e-9, 4.053212060637025e-9, 4.053210898281218e-9]
Clearly, there are degeneracies here, likely due to the SU2 symmetry of the Heisenberg model. As always, Krylov method cannot guarantee to find this; I think it is pretty amazing that it can find these degenerate values so well. But this is why the reverse algorithm chokes; there the degeneracies do cause the issue of the smaller invariant subspace.
I will need to think a bit about this. The fact that you don't get a "gauge dependence" warning, probably means that the adjoint variables associated with the singular vectors of degenerate singular values are all linearly dependent, so that maybe you can just replace it with having to solve a single linear system, and thus in the auxiliary eigenvalue problem that you build, you could throw out the degenerate copies.
I assume this issue has not yet magically gone away yet. How pressing is this?
As far as I know we managed to work around this in PEPSKit and I have not seen it again myself.
Ran into this again by accident. It doesn't occur in any of the regular tests anymore, but I can reproduce it by starting a PEPS optimization from a product state. Not really pressing I think, but very confusing to run into. MWE:
# using current PEPSKit.jl master...
# Pkg.add(;
# url="https://github.com/QuantumKitHub/PEPSKit.jl",
# rev="4de9e091b7f722933b28f097f1feaa8498ce5f04",
# )
using Random
using PEPSKit
using TensorKit
using OptimKit
g = 3.1
e = -1.6417 * 2
mˣ = 0.91
# initialize parameters
χbond = 2
χenv = 16
ctm_alg = SimultaneousCTMRG(; )
opt_alg = PEPSOptimize(;
boundary_alg=ctm_alg, optimizer=LBFGS(4; gradtol=1e-3, verbosity=3)
)
# initialize states
H = transverse_field_ising(InfiniteSquare(); g)
Random.seed!(72534343814)
psi_init = product_peps(2, χbond; noise_amp=2e-1)
env_init = leading_boundary(CTMRGEnv(psi_init, ComplexSpace(χenv)), psi_init, ctm_alg)
# find fixedpoint
result = fixedpoint(psi_init, H, opt_alg, env_init)
(dev) pkg> status
Status `~/git/PEPSKit.jl/dev/Project.toml`
[0b1a1467] KrylovKit v0.9.3
⌅ [77e91f04] OptimKit v0.3.1
[52969e89] PEPSKit v0.3.0 `https://github.com/QuantumKitHub/PEPSKit.jl#4de9e09`
[07d1fe3e] TensorKit v0.14.3
[9a3f8284] Random
Stacktrace
ERROR: LoadError: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:352 [inlined]
[2] fetch
@ ./task.jl:372 [inlined]
[3] fetch
@ ~/.julia/packages/StableTasks/3CrzR/src/internals.jl:9 [inlined]
[4] mapreduce_first
@ ./reduce.jl:424 [inlined]
[5] _mapreduce(f::typeof(fetch), op::typeof(BangBang.append!!), ::IndexLinear, A::Vector{StableTasks.StableTask{Any}})
@ Base ./reduce.jl:435
[6] _mapreduce_dim(f::Function, op::Function, ::Base._InitialValue, A::Vector{StableTasks.StableTask{Any}}, ::Colon)
@ Base ./reducedim.jl:365
[7] mapreduce(f::Function, op::Function, A::Vector{StableTasks.StableTask{Any}})
@ Base ./reducedim.jl:357
[8] _tmapreduce(f::Function, op::Function, Arrs::Tuple{Vector{UnitRange{Int64}}}, ::Type{Any}, scheduler::OhMyThreads.Schedulers.DynamicScheduler{OhMyThreads.Schedulers.FixedCount, ChunkSplitters.Consecutive}, mapreduce_kwargs::@NamedTuple{})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:113
[9] #tmapreduce#22
@ ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:85 [inlined]
[10] _tmap(scheduler::OhMyThreads.Schedulers.DynamicScheduler{OhMyThreads.Schedulers.FixedCount, ChunkSplitters.Consecutive}, f::Function, A::Array{Tuple{Tuple{…}, Zygote.var"#ad_pullback#63"{…}}, 3}, _Arrs::Array{ChainRulesCore.Tangent{Any, Tuple{…}}, 3})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:451
[11] #tmap#102
@ ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:373 [inlined]
[12] tmap
@ ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:337 [inlined]
[13] (::PEPSKit.var"#dtmap_pullback#18"{@Kwargs{}, Array{ChainRulesCore.ProjectTo{…}, 3}, typeof(identity), Array{Tuple{…}, 3}})(dy_raw::Array{ChainRulesCore.Tangent{Any, Tuple{…}}, 3})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/utility/diffable_threads.jl:25
[14] (::Zygote.ZBack{PEPSKit.var"#dtmap_pullback#18"{@Kwargs{}, Array{…}, typeof(identity), Array{…}}})(dy::Array{Tuple{TensorMap{…}, TensorMap{…}}, 3})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:221
[15] simultaneous_projectors
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/ctmrg/simultaneous.jl:64 [inlined]
[16] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{Tuple{…}, Nothing})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[17] ctmrg_iteration
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/ctmrg/simultaneous.jl:37 [inlined]
[18] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{CTMRGEnv{…}, Nothing})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[19] f
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:271 [inlined]
[20] (::Zygote.Pullback{Tuple{PEPSKit.var"#f#400", InfinitePEPS{…}, CTMRGEnv{…}}, Any})(Δ::CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{…}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{…}}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[21] (::Zygote.var"#ad_pullback#63"{Tuple{PEPSKit.var"#f#400", InfinitePEPS{…}, CTMRGEnv{…}}, Zygote.Pullback{Tuple{…}, Any}})(Δ::CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{…}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{…}}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:273
[22] (::PEPSKit.var"#∂f∂x#402"{Zygote.var"#ad_pullback#63"{Tuple{…}, Zygote.Pullback{…}}, CTMRGEnv{TensorMap{…}, TensorMap{…}}})(x::CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{…}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{…}}})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:276
[23] apply(f::PEPSKit.var"#∂f∂x#402"{Zygote.var"#ad_pullback#63"{Tuple{…}, Zygote.Pullback{…}}, CTMRGEnv{TensorMap{…}, TensorMap{…}}}, x::CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{…}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{…}}})
@ KrylovKit ~/.julia/packages/KrylovKit/SNKgC/src/apply.jl:2
[24] apply(A::Function, x::KrylovKit.InnerProductVec{typeof(KrylovKit._realinner), CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{ComplexF64}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{ComplexF64}}}})
@ KrylovKit ~/.julia/packages/KrylovKit/SNKgC/src/KrylovKit.jl:251
[25] linsolve(operator::Function, b::KrylovKit.InnerProductVec{…}, x₀::KrylovKit.InnerProductVec{…}, alg::KrylovKit.BiCGStab{…}, a₀::Int64, a₁::Int64; alg_rrule::KrylovKit.BiCGStab{…})
@ KrylovKit ~/.julia/packages/KrylovKit/SNKgC/src/linsolve/bicgstab.jl:3
[26] linsolve
@ ~/.julia/packages/KrylovKit/SNKgC/src/linsolve/bicgstab.jl:1 [inlined]
[27] reallinsolve(f::Function, b::CTMRGEnv{TensorMap{…}, TensorMap{…}}, x₀::CTMRGEnv{TensorMap{…}, TensorMap{…}}, alg::KrylovKit.BiCGStab{Float64}, a₀::Int64, a₁::Int64)
@ KrylovKit ~/.julia/packages/KrylovKit/SNKgC/src/linsolve/linsolve.jl:249
[28] fpgrad(∂F∂x::CTMRGEnv{TensorMap{…}, TensorMap{…}}, ∂f∂x::Function, ∂f∂A::PEPSKit.var"#∂f∂A#401"{Zygote.var"#ad_pullback#63"{…}, InfinitePEPS{…}}, y₀::CTMRGEnv{TensorMap{…}, TensorMap{…}}, alg::LinSolver{:fixed})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:350
[29] leading_boundary_fixed_pullback
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:277 [inlined]
[30] #36
@ ~/.julia/packages/PEPSKit/WLQD9/src/utility/hook_pullback.jl:34 [inlined]
[31] ZBack
@ ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:221 [inlined]
[32] (::Zygote.var"#kw_zpullback#58"{PEPSKit.var"#36#37"{PEPSKit.var"#leading_boundary_fixed_pullback#399"{…}}})(dy::CTMRGEnv{TensorMap{ComplexF64, ComplexSpace, 1, 1, Vector{…}}, TensorMap{ComplexF64, ComplexSpace, 3, 1, Vector{…}}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:247
[33] #389
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:171 [inlined]
[34] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Float64)
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[35] (::Zygote.var"#88#89"{Zygote.Pullback{Tuple{…}, Tuple{…}}})(Δ::Float64)
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface.jl:97
[36] withgradient(f::Function, args::InfinitePEPS{TensorMap{ComplexF64, ComplexSpace, 1, 4, Vector{ComplexF64}}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface.jl:219
[37] (::PEPSKit.var"#388#392"{LocalOperator{Tuple{…}, ComplexSpace}, PEPSOptimize{LinSolver{…}}})(::Tuple{InfinitePEPS{TensorMap{…}}, CTMRGEnv{TensorMap{…}, TensorMap{…}}})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:170
[38] optimize(fg::PEPSKit.var"#388#392"{…}, x::Tuple{…}, alg::LBFGS{…}; precondition::typeof(OptimKit._precondition), finalize!::typeof(OptimKit._finalize!), retract::Function, inner::typeof(PEPSKit.real_inner), transport!::typeof(OptimKit._transport!), scale!::typeof(OptimKit._scale!), add!::typeof(OptimKit._add!), isometrictransport::Bool)
@ OptimKit ~/.julia/packages/OptimKit/xpmbV/src/lbfgs.jl:21
[39] fixedpoint(ψ₀::InfinitePEPS{TensorMap{…}}, H::LocalOperator{Tuple{…}, ComplexSpace}, alg::PEPSOptimize{LinSolver{…}}, env₀::CTMRGEnv{TensorMap{…}, TensorMap{…}}; finalize!::Function, symmetrization::Nothing)
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:167
[40] fixedpoint(ψ₀::InfinitePEPS{TensorMap{…}}, H::LocalOperator{Tuple{…}, ComplexSpace}, alg::PEPSOptimize{LinSolver{…}}, env₀::CTMRGEnv{TensorMap{…}, TensorMap{…}})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/peps_opt.jl:144
[41] top-level scope
@ ~/git/PEPSKit.jl/dev/svdsolve_rrule_bug.jl:31
[42] include(fname::String)
@ Base.MainInclude ./client.jl:489
[43] top-level scope
@ REPL[1]:1
nested task error: BoundsError: attempt to access 15-element Vector{Tuple{Vector{ComplexF64}, Vector{ComplexF64}, Vector{ComplexF64}}} at index [16]
Stacktrace:
[1] getindex
@ ./essentials.jl:13 [inlined]
[2] compute_svdsolve_pullback_data(Δvals::SubArray{…}, Δlvecs::Vector{…}, Δrvecs::Vector{…}, vals::SubArray{…}, lvecs::Vector{…}, rvecs::Vector{…}, info::KrylovKit.ConvergenceInfo{…}, f::Base.ReshapedArray{…}, which::Symbol, alg_primal::KrylovKit.GKL{…}, alg_rrule::KrylovKit.Arnoldi{…})
@ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/SNKgC/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:239
[3] (::PEPSKit.var"#tsvd!_itersvd_pullback#29"{…})(ΔUSVϵ::ChainRulesCore.Tangent{…})
@ PEPSKit ~/.julia/packages/PEPSKit/WLQD9/src/utility/svd.jl:166
[4] ZBack
@ ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:221 [inlined]
[5] (::Zygote.var"#kw_zpullback#58"{…})(dy::Tuple{…})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:247
[6] compute_projector
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/ctmrg/projectors.jl:69 [inlined]
[7] (::Zygote.Pullback{Tuple{…}, Any})(Δ::Tuple{Tuple{…}, @NamedTuple{…}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[8] simultaneous_projectors
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/ctmrg/simultaneous.jl:86 [inlined]
[9] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{Tuple{…}, @NamedTuple{…}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[10] #294
@ ~/.julia/packages/PEPSKit/WLQD9/src/algorithms/ctmrg/simultaneous.jl:67 [inlined]
[11] (::Zygote.Pullback{Tuple{…}, Tuple{…}})(Δ::Tuple{TensorMap{…}, TensorMap{…}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/interface2.jl:0
[12] (::Zygote.var"#ad_pullback#63"{Tuple{…}, Zygote.Pullback{…}})(Δ::ChainRulesCore.Tangent{Any, Tuple{…}})
@ Zygote ~/.julia/packages/Zygote/3To5I/src/compiler/chainrules.jl:273
[13] #15
@ ~/.julia/packages/PEPSKit/WLQD9/src/utility/diffable_threads.jl:26 [inlined]
[14] #4
@ ./generator.jl:36 [inlined]
[15] iterate
@ ./generator.jl:47 [inlined]
[16] collect(itr::Base.Generator{Base.Iterators.Zip{Tuple{SubArray{…}, SubArray{…}}}, Base.var"#4#5"{PEPSKit.var"#15#19"}})
@ Base ./array.jl:834
[17] map
@ ./abstractarray.jl:3409 [inlined]
[18] (::OhMyThreads.Implementation.var"#132#135"{PEPSKit.var"#15#19", Tuple{Array{…}, Array{…}}})(inds::UnitRange{Int64})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:448
[19] mapreduce_first
@ ./reduce.jl:424 [inlined]
[20] _mapreduce(f::OhMyThreads.Implementation.var"#132#135"{…}, op::typeof(BangBang.append!!), ::IndexLinear, A::SubArray{…})
@ Base ./reduce.jl:435
[21] _mapreduce_dim
@ ./reducedim.jl:365 [inlined]
[22] mapreduce
@ ./reducedim.jl:357 [inlined]
[23] #27
@ ~/.julia/packages/OhMyThreads/eiaNP/src/implementation.jl:110 [inlined]
[24] (::OhMyThreads.Implementation.var"#28#36"{StableTasks.AtomicRef{…}, OhMyThreads.Implementation.var"#27#35"{…}})()
@ OhMyThreads.Implementation ~/.julia/packages/StableTasks/3CrzR/src/internals.jl:67
in expression starting at /home/leburgel/git/PEPSKit.jl/dev/svdsolve_rrule_bug.jl:31
Some type information was truncated. Use `show(err)` to see complete types.
Package and version info
(dev) pkg> status
Status `~/git/PEPSKit.jl/dev/Project.toml`
[0b1a1467] KrylovKit v0.9.3
⌅ [77e91f04] OptimKit v0.3.1
[52969e89] PEPSKit v0.3.0 `https://github.com/QuantumKitHub/PEPSKit.jl#4de9e09`
[07d1fe3e] TensorKit v0.14.3
[9a3f8284] Random
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
This keeps popping up here and there, and becomes significantly worse when working with symmetries with small blocks. Do you think there is a sensible fix for this @Jutho, or does this really mean something is going fundamentally wrong in the simulations that we should safeguard against?
I just fixed the issue by setting a small tolerance (1.0e-10) in Arnoldi algorithm occurred in my case.
So it seems that at least in some cases, this issue is caused by the fact that the magnitude of the smallest remaining singular value lies below the tolerance of the rrule_alg, and this causes things to go wrong. A straightforward attempted fix would then be to determine the rrule_alg tolerance dynamically based on the magnitude of the smallest singular value (just @set it to the minimum of the provided value and the smallest singular value divided by 10 or something?). This might not fix all occurrences (degeneracies remain a mystery I guess), but at least it sounds like a sensible thing to do regardless, at least intuitively.
What do you think @Jutho?
Yes; I do not feel great deviating from the arguments that were entered, but I guess there is (currently) no way to replicate such dynamic behavior in any other way. Is this still in the case where you actually use the KrylovKit rrule for a truncated SVD obtained from a full diagonalization?
Is this still in the case where you actually use the KrylovKit
rrulefor a truncated SVD obtained from a full diagonalization?
Yes, I think all the occurrences of this issue in PEPSKit.jl are within that specific context. We should be able to circumvent this soon once QuantumKitHub/PEPSKit.jl#150 is merged, but it still seems worth it to address this issue here. Currently there's just not many people using this so it's hard to tell if this issue would come up in other contexts, but in principle I don't see why it wouldn't.
I do not feel great deviating from the arguments that were entered, but I guess there is (currently) no way to replicate such dynamic behavior in any other way.
I guess in spirit it's kind of a similar issue as when svdsolve is called with a krylovdim smaller than howmany, in the sense that the user-supplied arguments are just really not set up to even be able to solve the problem in a sensible way. I think now KrylovKit.jl just throws an error in this case, but in principle it could just as well use some heuristic to increase the krylovdim to an actually sensible value.
If you prefer to throw an error in case the rrule tolerance is larger than the magnitude of the smallest singular value to enforce that the arguments that were entered actually make sense, I think even that would be better than the kind of silent failures we're getting now.
I am still a bit confused; given that you manually call KrylovKit's rrule, is it possible to set the tolerance after you already know S from within PEPSKit? I definitely agree that this will pop up elsewhere, although also the forward computation is ill-defined if singular values are smaller than tolerance.
I am still a bit confused; given that you manually call KrylovKit's rrule, is it possible to set the tolerance after you already know
Sfrom within PEPSKit? I definitely agree that this will pop up elsewhere, although also the forward computation is ill-defined if singular values are smaller than tolerance.
Yes, it's definitely possible to set the tolerance from within PEPSKit.jl. Part of the question (which may not have been clear from my explanation) is whether it would be better to circumvent this from PEPSKit.jl, or address it directly here. Since it seemed to me that this problem is more broad than just our use case I personally thought it might be good to address it here.
But indeed, while I keep kind of forgetting we're never going through the actual forward computation in our use case, it is definitely much more likely for the trouble to start in the forward computation in a normal use case. Given this, it does feel a bit strange to change to try to fix things in the backward pass when the forward was ill-defined.
If not intervening, does it make sense to put some warnings in place? Performing a tolerance versus smallest singular value magnitude check in the rrule and throwing a warning if the former is larger is simple enough I guess and might be helpful. Will this issue be caught by any of the current warnings in the forward computation?