ForwardDiff.jl

Derivative of matrix inverse for diagonal matrix is not correct?

Open • wwang2 opened this issue 4 years ago • 1 comment

I am reporting a dubious result for the derivative of a matrix inverse when the matrix is diagonal.

Package version: [f6369f11] ForwardDiff v0.10.14

A = [0.5 0 0 ; 0 0.5 0; 0 0 0.5]

ForwardDiff.gradient(A -> sum(inv(A)), A)

result:

3×3 Array{Float64,2}:    
 -4.0  -4.0  -4.0
  0.0  -4.0  -4.0
  0.0   0.0  -4.0

However, the analytical result (for this symmetric A) should be:

-inv(A) * ones(3,3) * inv(A)

result:

3×3 Array{Float64,2}:
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0
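
For reference, here is a short sketch of where that expression comes from. The symbols $f$ and $\mathbf{1}$ (the all-ones vector) are notation introduced here and are not part of the original report.

$$
f(A) = \operatorname{sum}(A^{-1}) = \mathbf{1}^\top A^{-1} \mathbf{1},
\qquad
d(A^{-1}) = -A^{-1}\,(dA)\,A^{-1}
$$

$$
df = -\mathbf{1}^\top A^{-1}\,(dA)\,A^{-1}\mathbf{1}
   = \operatorname{tr}\!\left(-A^{-1}\mathbf{1}\mathbf{1}^\top A^{-1}\,dA\right)
\;\;\Longrightarrow\;\;
\nabla_A f = -A^{-\top}\,\mathbf{1}\mathbf{1}^\top A^{-\top}
$$

For a symmetric A this reduces to -inv(A) * ones(3,3) * inv(A). With A = 0.5 I we have inv(A) = 2 I, so every entry of the gradient is -4, including the entries below the diagonal.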

The Zygote gradient function gives the expected result:

Zygote.gradient(A -> sum(inv(A)), A)[1]

result:

3×3 Array{Float64,2}:
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0

wwang2 · Dec 28 '20 18:12

This is an unintended consequence of the polyalgorithm I wrote many years ago in Julia's LinearAlgebra module, https://github.com/JuliaLang/julia/blob/8e0183f2b66b5578d897a2c8318a63667a27fb8a/stdlib/LinearAlgebra/src/dense.jl#L803-L816. The matrix is detected to be triangular and is then inverted as an UpperTriangular matrix, which avoids the matrix factorization. Because the UpperTriangular inverse never reads the entries below the diagonal, the derivatives with respect to those entries come out as zero. I think https://github.com/JuliaDiff/ForwardDiff.jl/issues/480 would fix this issue generally, since istriu would then no longer be true. In the short term, you can work around the issue by calling lu directly to bypass the polyalgorithm, i.e.

julia> ForwardDiff.gradient(t -> sum(inv(lu(t))), A)
3×3 Matrix{Float64}:
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0
 -4.0  -4.0  -4.0
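
To illustrate why the triangular path loses the lower-triangle derivatives, here is a minimal sketch that uses only the LinearAlgebra standard library and finite differences instead of ForwardDiff. The step size h, the perturbation matrix E, and the names fd_dense and fd_triu are assumptions made for this illustration. It mimics the effect described above and is not the code path ForwardDiff actually takes.

```julia
using LinearAlgebra

A = [0.5 0 0; 0 0.5 0; 0 0 0.5]

# Hypothetical finite-difference setup: nudge the single entry A[2, 1],
# which lies below the diagonal.
h = 1e-6
E = zeros(3, 3)
E[2, 1] = 1.0

# Ordinary inverse of the perturbed matrix: the perturbation is seen,
# and the difference quotient is approximately -4.0.
fd_dense = (sum(inv(A + h * E)) - sum(inv(A))) / h

# Inverse through the UpperTriangular wrapper: entries below the diagonal
# are never read, so the perturbation has no effect and the quotient is 0.0.
fd_triu = (sum(inv(UpperTriangular(A + h * E))) - sum(inv(UpperTriangular(A)))) / h

println("dense path:      ", fd_dense)
println("triangular path: ", fd_triu)
```

The zero from the triangular path corresponds to the zeros below the diagonal in the reported ForwardDiff gradient, while the dense path agrees with the analytical value of -4.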

andreasnoack · Dec 28 '20 21:12