DistributionsAD.jl
Need for MvNormal with sparse precision matrix
After much digging and testing, it appears you do not currently support sparse matrix constructions for AD. Is this correct? In particular, I need an MvNormal distribution that is specified using a sparse precision matrix and works with the autodiff in Turing. This is common for large spatial problems using Gaussian Markov random fields. I came to Turing for this because Stan does not support sparse matrix operations. This would be a really valuable addition to your package if you can make it work.
No, MvNormalCanon (which you are referring to) is not supported yet. It would be a valuable addition.
One that allows a sparse precision matrix is the key.
Sure, I understand. I was just referring to the fact that MvNormalCanon is the distribution type in Distributions for multivariate normal distributions that are parameterized by precision matrices.
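For illustration, here is a tiny dense example of the difference in parameterization (the numbers are arbitrary, just to show the two constructors describe the same distribution):

```julia
# Dense toy example: MvNormal takes a covariance matrix, MvNormalCanon takes a
# potential vector and a precision matrix (the inverse covariance).
using Distributions, LinearAlgebra

Σ = [1.0 0.5; 0.5 1.0]               # covariance matrix
Q = inv(Σ)                           # precision matrix
d_cov  = MvNormal(zeros(2), Σ)
d_prec = MvNormalCanon(zeros(2), Q)  # zero potential vector => zero mean

logpdf(d_cov, [0.1, -0.2]) ≈ logpdf(d_prec, [0.1, -0.2])  # true
```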
Yes, thanks for pointing that out. Is there any reason that sparse operations would cause problems for autodiff?
This is also something I would find really valuable. From trying out a toy model, am I correct that Distributions is using PDMats, which, when it constructs a PDSparseMat, calls a cholesky routine from CHOLMOD, which chokes on the Dual numbers?
```julia
using Turing
using SparseArrays
using PDMats

@model function MvNormalCanonTest(x)
    n = length(x)
    ρ ~ Uniform(0, 0.49)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    x ~ MvNormalCanon(PDSparseMat(Q))
end

x = randn(10)
m = MvNormalCanonTest(x)
sample(m, NUTS(), 100)
```
TypeError: in Sparse, in Tv, expected Tv<:Union{Complex{Float64}, Float64}, got Type{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1}}
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}, ::Int64) at cholmod.jl:933
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at cholmod.jl:939
cholesky(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}; kws::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at cholmod.jl:1458
cholesky at cholmod.jl:1458 [inlined]
PDSparseMat(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at pdsparsemat.jl:22
macro expansion at untitled-d3c70e9b18556a75b02f77b5bb73e87c:245 [inlined]
##evaluator#299(::Random._GLOBAL_RNG, ::DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}}, ::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f...
This PR is probably relevant: JuliaDiff/ChainRules.jl#246. If I understand the AD machinery being used here (I don't), once sparse array support is added, rules for sparse Cholesky factorizations can be added that will enable differentiating through them. Unfortunately CHOLMOD doesn't know what Dual numbers are, so ForwardDiff AD fails.
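A minimal sketch of the same failure outside Turing (the error should be the same TypeError thrown by CHOLMOD's Sparse constructor, though the exact message may differ):

```julia
# cholesky on a SparseMatrixCSC dispatches to CHOLMOD, which only accepts
# Float64/ComplexF64 elements, so a Dual-valued sparse matrix is rejected.
using ForwardDiff, SparseArrays, LinearAlgebra

f(ρ) = logdet(cholesky(spdiagm(-1 => fill(ρ, 9), 0 => ones(10), 1 => fill(ρ, 9))))

f(0.3)                          # works: Float64 elements
ForwardDiff.derivative(f, 0.3)  # fails: the sparse matrix now has Dual elements
```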
Reading the stacktrace for my error above more carefully, I see the error is actually coming from the Sparse constructor, not the Cholesky decomposition, so hopefully https://github.com/JuliaDiff/ChainRules.jl/pull/246 will solve the problem.
I've done more investigating and think that it might be easier to implement a custom TuringMvNormalCanon type in this package, at least as long as Distributions uses PDMats. The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.
However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices. Using this approach, I've hacked together a custom MvNormalCanon implementation that works in my example Turing model above. @devmotion, does this sound like a worthwhile PR?
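Roughly, the idea is something like the following sketch (zero-mean case only; this is not the actual implementation, and details such as the `F.d` field are from my reading of LDLFactorizations.jl and may need adjusting):

```julia
# Sketch of a precision-parameterized normal logpdf that avoids CHOLMOD by
# factorizing with LDLFactorizations.jl, which works for generic element types
# (e.g. ForwardDiff.Dual). `F.d` is assumed to expose the diagonal of D.
using LDLFactorizations, SparseArrays, LinearAlgebra

function sparse_mvnormalcanon_logpdf(Q::SparseMatrixCSC, x::AbstractVector)
    F = ldl(Q)                  # pure-Julia LDL' factorization of the precision matrix
    logdetQ = sum(log, F.d)     # log|Q| from the diagonal of D (L has unit diagonal)
    quadform = dot(x, Q * x)    # xᵀ Q x for a zero-mean distribution
    return (logdetQ - length(x) * log(2π) - quadform) / 2
end
```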
The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.
I don't understand why there is a problem with the custom adjoint here - what is missing or broken in the definition of the adjoint in ChainRules (https://github.com/JuliaDiff/ChainRules.jl/blob/f63f385e7a8477c5112f0b2cf24790ca1abf6320/src/rulesets/LinearAlgebra/factorization.jl#L73)?
However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices.
Interesting, I didn't know this package. I wonder if it is problematic that it uses the LGPL license.
In my example above, the precision matrix Q is defined in terms of the unknown parameter ρ, so when applying any autodiff routine to the model, Q ends up being a SparseMatrixCSC with elements of some dual number type. The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.
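You can see the eager factorization in the constructor directly (the `chol` field name is from my reading of the PDMats source and may differ between versions):

```julia
# PDSparseMat computes and stores the Cholesky factorization at construction
# time, so the failure happens before any likelihood evaluation.
using PDMats, SparseArrays

Q = spdiagm(-1 => fill(0.3, 9), 0 => ones(10), 1 => fill(0.3, 9))
P = PDSparseMat(Q)
typeof(P.chol)   # SuiteSparse.CHOLMOD.Factor{Float64}, computed eagerly
```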
The rrule seems to be defined correctly, but it still calls CHOLMOD if the matrix X is sparse (line 74 in the file you linked). My understanding of how rrules work is pretty superficial, so I may be wrong, but I think it would run into the same problem there, right?
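A quick REPL check of the dispatch (nothing Turing-specific):

```julia
# For a sparse input, the primal factorization inside the rule still goes
# through the CHOLMOD method of `cholesky`.
using SparseArrays, LinearAlgebra, InteractiveUtils

Q = sparse(1.0I, 5, 5)
@which cholesky(Q)    # points at the SuiteSparse/CHOLMOD method
typeof(cholesky(Q))   # SuiteSparse.CHOLMOD.Factor{Float64}
```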
Regarding the LGPL...I'm not a lawyer, but I believe it's okay to use LGPL libraries in non-GPL code, as long as the non-GPL code isn't a derivative work. SuiteSparse itself is under the LGPL, so if using LDLFactorizations causes license issues, they're at least not new ones...
Ah OK, I think I understand now. I'm a bit surprised that ForwardDiff doesn't support cholesky of sparse matrices, but I've never checked.
Did you try to use Zygote instead of ForwardDiff? It uses the adjoints in ChainRules and does not operate with dual numbers.
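For reference, setting the backend would be something like this (the exact call may differ between Turing versions):

```julia
# Switch the AD backend to Zygote and re-run the model defined above
# (assuming `m` is the model instance from the earlier example).
using Turing, Zygote

Turing.setadbackend(:zygote)
sample(m, NUTS(), 100)
```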
Zygote doesn't work either; it fails with a Mutating arrays is not supported error. Some of the linear algebra functions build up their results in place.
Traceback
```julia Mutating arrays is not supported error(::String) at error.jl:33 (::Zygote.var"#364#365")(::Nothing) at array.jl:58 (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at adjoint.jl:49 diag at cholmod.jl:1792 [inlined] (::typeof(∂(diag)))(::Array{Float64,1}) at interface2.jl:0 logdet at cholmod.jl:1801 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdet at pdsparsemat.jl:55 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdetcov at mvnormalcanon.jl:165 [inlined] (::typeof(∂(logdetcov)))(::Float64) at interface2.jl:0 mvnormal_c0 at mvnormal.jl:99 [inlined] (::typeof(∂(mvnormal_c0)))(::Int64) at interface2.jl:0 _logpdf at mvnormal.jl:127 [inlined] (::typeof(∂(_logpdf)))(::Int64) at interface2.jl:0 #119 at multivariates.jl:262 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 #1071 at broadcast.jl:140 [inlined] #3 at generator.jl:36 [inlined] iterate at generator.jl:47 [inlined] collect(::Base.Generator{Base.Iterators.Zip{Tuple{Array{typeof(∂(λ)),1},FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}}},Base.var"#3#4"{Zygote.var"#1071#1078"}}) at array.jl:686 map at abstractarray.jl:2248 [inlined] (::Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at broadcast.jl:140 #3862#back at adjoint.jl:49 [inlined] (::Zygote.var"#150#151"{Zygote.var"#3862#back#1081"{Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}}},Tuple{Tuple{Nothing,Nothing,Nothing},Tuple{}}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at lib.jl:191 #1693#back at adjoint.jl:49 [inlined] broadcasted at broadcast.jl:1257 [inlined] (::typeof(∂(broadcasted)))(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at interface2.jl:0 #556 at array.jl:243 [inlined] (::typeof(∂(#556)))(::Int64) at interface2.jl:0 #41 at interface.jl:45 [inlined] #557 at array.jl:244 [inlined] sum at reducedim.jl:720 [inlined] (::typeof(∂(sum)))(::Int64) at interface2.jl:0 loglikelihood at multivariates.jl:262 [inlined] (::typeof(∂(loglikelihood)))(::Int64) at interface2.jl:0 observe at context_implementations.jl:152 [inlined] observe at hmc.jl:557 [inlined] _tilde at context_implementations.jl:109 [inlined] tilde at context_implementations.jl:67 [inlined] (::typeof(∂(tilde)))(::Int64) at interface2.jl:0 tilde_observe at context_implementations.jl:89 [inlined] (::typeof(∂(tilde_observe)))(::Nothing) at interface2.jl:0 #7 at untitled-80c47956c154a0947dfbe4a5c1a877ba:92 [inlined] (::typeof(∂(#7)))(::Nothing) at interface2.jl:0 macro expansion at model.jl:0 [inlined] _evaluate at model.jl:160 [inlined] (::typeof(∂(_evaluate)))(::Nothing) at interface2.jl:0 evaluate_threadsafe at model.jl:150 [inlined] (::typeof(∂(evaluate_threadsafe)))(::Nothing) at interface2.jl:0 Model at model.jl:94 [inlined] (::typeof(∂(λ)))(::Nothing) at interface2.jl:0 #150 at lib.jl:191 [inlined] #1693#back at adjoint.jl:49 [inlined] Model at model.jl:98 [inlined] f at ad.jl:165 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 (::Zygote.var"#41#42"{typeof(∂(λ))})(::Int64) at interface.jl:45 gradient_logp(::Turing.Core.ZygoteAD, ::Array{Float64,1}, ::DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}, 
::DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}, ::DynamicPPL.DefaultContext) at ad.jl:171 gradient_logp at ad.jl:83 [inlined] gradient_logp at ad.jl:83 [inlined] ∂logπ∂θ at hmc.jl:474 [inlined] ∂H∂θ at hamiltonian.jl:31 [inlined] phasepoint at hamiltonian.jl:69 [inlined] find_good_stepsize(::Random._GLOBAL_RNG, ::AdvancedHMC.Hamiltonian{AdvancedHMC.DiagEuclideanMetric{Float64,Array{Float64,1}},Turing.Inference.var"#logπ#55"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}},DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}},Turing.Inference.var"#∂logπ∂θ#54"{DynamicPPL.VarI... ```
Sorry, it's not clear to me what exactly you tested here. Is it the Turing model above? That might still require some additional work, but I was just referring to the statements about AD of cholesky above - I assumed that this should work with sparse matrices since Zygote should use the custom adjoint defined in ChainRules.
Yes, I just ran the Turing model above after setting the AD backend to Zygote... sorry for not being clearer. However, I also get the same error with a hand-coded likelihood function:
```julia
using Distributions, Zygote, SparseArrays, PDMats

const xobs = randn(10)

function loglik(θ)
    ρ = 0.5tanh(θ[1])
    n = length(xobs)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    d = MvNormalCanon(PDSparseMat(Q))
    return logpdf(d, xobs)
end

Zygote.gradient(loglik, [1.0])
# ERROR: Mutating arrays is not supported
```
For what it's worth, the pullback and pushforward of cholesky with a sparse argument appear to work:
```julia
julia> n = 100
julia> ρ = 0.5
julia> Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))

julia> pullback(cholesky, Q)
(SuiteSparse.CHOLMOD.Factor{Float64}
type: LLt
method: simplicial
maxnnz: 19
nnz: 19
success: true
, Zygote.var"#41#42"{Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}}(Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}(ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}(SuiteSparse.CHOLMOD.Factor{Float64}
type: LLt
method: simplicial
maxnnz: 19
nnz: 19
success: true
))))

julia> pushforward(cholesky, Q)
#8 (generic function with 1 method)
```
But trying to get the gradient of logdet with respect to a sparse matrix fails:
```julia
julia> Zygote.gradient(logdet, Q)
ERROR: MethodError: no method matching logabsdet(::SuiteSparse.UMFPACK.UmfpackLU{Float64,Int64})
Closest candidates are:
  logabsdet(::SymTridiagonal; shift) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/tridiag.jl:441
  logabsdet(::UnitUpperTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2652
  logabsdet(::UnitLowerTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2653
  ...
Stacktrace:
 [1] logabsdet(::SparseMatrixCSC{Float64,Int64}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1573
 [2] adjoint at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:378 [inlined]
 [3] _pullback at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
 [4] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1596 [inlined]
 [5] _pullback(::Zygote.Context, ::typeof(logdet), ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] _pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:38
 [7] pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:44
 [8] gradient(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:53
 [9] top-level scope at REPL[22]:1
```
And if you use a function to create an AR(1) precision matrix and try to differentiate logdet with respect to ρ, you get the same error about mutating arrays.
```julia
julia> spchol(ρ; n = 100) = cholesky(spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1)))

julia> Zygote.gradient(ρ -> logdet(spchol(ρ)), 0.5)
ERROR: Mutating arrays is not supported
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] (::Zygote.var"#364#365")(::Nothing) at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:58
 [3] (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
 [4] diag at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1792 [inlined]
 [5] (::typeof(∂(diag)))(::Array{Float64,1}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1801 [inlined]
 [7] (::typeof(∂(logdet)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [8] #4 at ./REPL[20]:1 [inlined]
 [9] (::typeof(∂(#4)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [10] (::Zygote.var"#41#42"{typeof(∂(#4))})(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:45
 [11] gradient(::Function, ::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:54
 [12] top-level scope at REPL[20]:1
```
I'm not sure if any of this is helpful, but maybe a Zygote issue would be more productive?
For what it's worth, the pullback and pushforward of cholesky with a sparse argument appear to work:
That's exactly what I assumed, and why I thought that for Zygote the problem mentioned above for ForwardDiff, i.e.
The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.
should not exist.
But trying to get the gradient of logdet with respect to a sparse matrix fails:
It seems logabsdet is not defined for sparse matrices (and its LU decomposition)?!
It seems logabsdet is not defined for sparse matrices (and its LU decomposition)?!
Weird that we'd be the first people to notice this, but that seems to be the case. There are a few closed issues for other missing logabsdet methods, but not for SparseMatrixCSC. I've opened an issue about it: https://github.com/JuliaLang/julia/issues/38463
Anyway, in the meantime...what do you think is the best way to move forward on this issue?
Some progress on this: logabsdet will now work for sparse matrices (on Julia master, should be released in v1.7). Once the AD issues get sorted out, this issue can probably be closed.
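For example, on Julia ≥ 1.7 something like this should work (untested here, just the expected behavior):

```julia
# logabsdet of a sparse matrix goes through its sparse LU factorization instead
# of throwing the MethodError shown above.
using SparseArrays, LinearAlgebra

Q = spdiagm(-1 => fill(0.3, 9), 0 => ones(10), 1 => fill(0.3, 9))
logabsdet(Q)   # returns (log|det(Q)|, sign(det(Q)))
```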