LAPACK exception (2) in backward QR diff
While random matrices work, I get LAPACK exceptions in `trtrs` when using it in my application (see the attached fpeps.zip).
```julia
using Zygote, BackwardsLinalg, fpeps
width = 4; height = 4; D = 2; d = 2; chi = 5
peps = Array{Array{ComplexF64,5},2}(undef, width, height)
for i = 1:width, j = 1:height
    peps[i,j] = rand(ComplexF64, D, D, D, D, d)
end
boundaries = fpeps.gen_boundaries(peps, chi)
n1 = boundaries[1][1]; n2 = boundaries[1][2]
fun(a) = real(sum(sum.(fpeps.north_vomps(n1, peps[1,:], a))))
fun'(n2)
```
Is there anything I should know about numerical stability problems (assuming the LAPACK error means that `trtrs` hit a numerical problem)? How can I go about debugging this further? The matrix passed to `trtrs` did become quite ill-conditioned in my example... I can try to reduce this further to a smaller minimal example if that helps.
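One thing I can do on my side is check the conditioning of the R factors before the backward pass runs, something like this (a minimal sketch; `check_qr_conditioning` is just an ad-hoc helper of mine, not part of fpeps or BackwardsLinalg):

```julia
using LinearAlgebra

# ad-hoc diagnostic (hypothetical helper, not from any package): report how
# ill-conditioned the R factor of a QR decomposition is before differentiating
function check_qr_conditioning(A::AbstractMatrix)
    R = Array(qr(A).R)
    κ = cond(R)
    κ > 1e12 && @warn "R factor is close to singular" κ sz = size(A)
    return κ
end

check_qr_conditioning(randn(ComplexF64, 10, 3) * randn(ComplexF64, 3, 10))  # rank 3, so κ is huge
```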
If I set the parameters to make sure that we are effectively working with 1x1 matrices, then I get errors later on with Zygote (I'm assuming it's unrelated to this package, but I'm not sure, as the stacktrace says effectively nothing).
Ok, so the problem is that my R is rank-deficient, and the inverse (which is done using `trtrs`) fails. I made sure my R's are now full rank and everything appears to work (as in, I now get different errors, probably unrelated to BackwardsLinalg).
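For the record, here is a minimal sketch of the failure mode in plain LinearAlgebra (no fpeps involved): a rank-deficient input gives an R with an exact zero on its diagonal, and the triangular solve that the QR pullback relies on then flags the matrix as singular.

```julia
using LinearAlgebra

A = [1.0 0.0; 0.0 0.0]        # rank 1, so the second diagonal entry of R is exactly zero
R = Matrix(qr(A).R)
B = Matrix{Float64}(I, 2, 2)

# the QR pullback solves a triangular system against R; for a singular R,
# trtrs reports info > 0 and the call errors out (this is the LAPACK exception I was seeing)
LAPACK.trtrs!('U', 'N', 'N', R, B)
```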
Thanks for your feedback. Yes, that happens sometimes. LinearAlgebra backward functions have the problem of exploding gradients.
Now I am looking for an approach to solve this problem once and for all: instruction-level automatic differentiation on a reversible eDSL, NiLang. Here is the code for a naive implementation of QR (without re-orthogonalization):
https://github.com/GiggleLiu/NiLang.jl/blob/master/project/qr.jl
This is at the research stage and is not ready for production (it does not use BLAS!). Hopefully it can solve the exploding-gradient problem in the future.
> LinearAlgebra backward functions have the problem of exploding gradients.
When do they have this problem? I know `svd` has it, but I assumed that the backward differentiation of `qr` was stable (because the one inverse step is well-behaved as long as R is well-behaved).
How does NiLang address this issue? I thought the exploding gradients had their origin in the inversion steps, and I don't understand how you can prevent this.
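For reference, the pullback I have in mind is roughly the textbook one (my own sketch, not copied from BackwardsLinalg; `qr_pullback_sketch` is just a name I made up), and the only inversion in it is a triangular solve against R:

```julia
using LinearAlgebra

# rough sketch of the usual QR pullback for A = Q*R with full-rank R;
# the only step that can blow up is the triangular solve against R'
function qr_pullback_sketch(Q, R, dQ, dR)
    M = R * dR' - dQ' * Q
    S = tril(M) + tril(M, -1)'                  # mirror the lower triangle to the upper
    return (dQ + Q * S) / UpperTriangular(R)'   # ill-conditioned exactly when R is
end
```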
`svd` and `symeig` have the spectrum degeneracy problem. QR (especially for rectangular matrices) has the rank deficiency problem in this implementation.
NiLang differentiates the basic components (the floating-point operations) rather than using manually derived formulas.
It depends on how the forward program works, e.g. the Jacobi method in https://web.stanford.edu/class/cme335/lecture7.pdf. Differentiating each instruction faithfully has no known caveat so far, but I am not sure it can solve the problem.
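To make "differentiating the basic components" concrete, here is a toy sketch in plain Julia (not NiLang code): the forward program is a sequence of primitive instructions, and the backward pass walks them in reverse, accumulating one adjoint per instruction instead of applying one manually derived formula for the whole function.

```julia
# toy illustration only (plain Julia, not NiLang)
function forward(a, b, c)
    t = a * b            # instruction 1
    y = t + c            # instruction 2
    return y, t
end

function backward(a, b, c, ȳ)
    _, t = forward(a, b, c)
    t̄, c̄ = ȳ, ȳ              # adjoint of instruction 2: y = t + c
    ā, b̄ = t̄ * b, t̄ * a      # adjoint of instruction 1: t = a * b
    return ā, b̄, c̄
end

backward(2.0, 3.0, 1.0, 1.0)   # (3.0, 2.0, 1.0), the gradient of a*b + c
```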