
Support for sparse arrays

Open gdalle opened this issue 3 years ago • 21 comments

At the moment, sparse arrays are densified when converted to vectors. It would be nice to be agnostic to the array type. Ping @mfherbst
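A minimal sketch of the densification, assuming the conversion goes through a plain Vector:

using SparseArrays

x = sprand(100, 0.05)   # a sparse vector with ~5 stored entries
xv = Vector(x)          # the vectorization step materializes a dense Vector{Float64}
count(!iszero, xv)      # still only ~5 nonzeros...
length(xv)              # ...but 100 stored elements: the sparsity is gone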

gdalle avatar Sep 20 '22 16:09 gdalle

Looks like it could be done with KrylovKit.jl, perhaps at the price of efficiency (no in-place version of the solver)

gdalle avatar Feb 18 '23 07:02 gdalle

Since we have no in-place differentiation (#48) and it does not look likely to happen soon, maybe it would make sense to switch to KrylovKit.jl for this reason? @mohamed82008

gdalle avatar May 24 '23 14:05 gdalle

Let's think some more about getting AD.jl to work. We need API suggestions and a PR. I am not opposed to the idea in principle but just need to be convinced that this can give us significant performance gains.

mohdibntarek avatar May 24 '23 16:05 mohdibntarek

Ok, labelling this as low priority

gdalle avatar May 30 '23 08:05 gdalle

I suggest we close this and open issues for specific array types, e.g. sparse arrays, when a use case arises.

mohdibntarek avatar Jul 30 '23 13:07 mohdibntarek

I think @thorek1 had a use case

gdalle avatar Jul 30 '23 13:07 gdalle

Indeed, and I solved it in a tailored rewrite of the ImplicitFunctions.jl way of doing it (see lines 3000-3100 here). The problem with ForwardDiff.jl and SparseArrays is that value.(SparseVectorA) and partials.(SparseVectorA) return wrong results (I didn't test sparse matrices). You need to reimplement those two and refactor some code downstream to take advantage of the sparsity.
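A hypothetical reproduction of the failing extraction, with duals built by hand rather than by ForwardDiff's jacobian machinery:

using SparseArrays
import ForwardDiff as ℱ

x = sparsevec([4, 5], [0.78, 0.50], 5)
V = map(v -> ℱ.Dual(v, 2v), x)           # value v with a single partial 2v

vals_sparse = ℱ.value.(V)                # the extraction reported as broken
vals_dense = ℱ.value.(Vector(V))         # dense reference for comparison
@show Vector(vals_sparse) == vals_dense  # reportedly false in the problematic cases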

thorek1 avatar Jul 30 '23 21:07 thorek1

I don't think we're gonna go that far for sparse array support

gdalle avatar Jul 30 '23 22:07 gdalle

I think we need a more in-depth analysis of where exactly sparsity gets lost in ImplicitDifferentiation, with an example. I suspect that if sparsity is lost, it's mostly due to upstream packages, e.g. ChainRules, not caring about sparsity. See the discussion in https://discourse.julialang.org/t/zygote-jl-how-to-get-the-gradient-of-sparse-matrix/59067. As of now, this issue is not actionable: there is no MWE, it's not specific to one array type, and we don't know whether we need to fix ImplicitDifferentiation or an upstream package to close it.

mohdibntarek avatar Jul 30 '23 23:07 mohdibntarek

@thorek1 following up on sparsity: if the AD backends destroy it, I don't think it's our job to fix it. But I'm happy to entertain a debate

gdalle avatar Aug 07 '23 16:08 gdalle

I don't really think that's the case: ForwardDiff works with sparse matrices, but the way they are handled within ImplicitDifferentiation doesn't seem to. I would lean towards improving how they are handled internally to get it to work.

thorek1 avatar Aug 07 '23 17:08 thorek1

Here's an example that fails in several different ways:

Setup

julia> using ForwardDiff

julia> using ImplicitDifferentiation

julia> using Zygote

julia> using SparseArrays

julia> forward(x) = sqrt.(x);

julia> conditions(x, y) = abs2.(y) .- x;

julia> implicit = ImplicitFunction(forward, conditions);

Sparse vector

julia> x = sprand(5, 0.3)
5-element SparseVector{Float64, Int64} with 2 stored entries:
  [4]  =  0.778476
  [5]  =  0.501691

julia> implicit(x)
5-element SparseVector{Float64, Int64} with 2 stored entries:
  [4]  =  0.882313
  [5]  =  0.708302

julia> Zygote.jacobian(implicit, x)[1]  # NaNs where there should be zeros
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅   NaN         NaN
  ⋅    ⋅    ⋅     0.566692     ⋅ 
  ⋅    ⋅    ⋅      ⋅          0.705914

julia> ForwardDiff.jacobian(implicit, x)  # zeros everywhere
5×5 Matrix{Float64}:
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0
 -0.0  -0.0  -0.0  -0.0  -0.0

Sparse matrix

julia> X = sprand(5, 5, 0.3)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
 0.43017   0.100286   ⋅        0.436475   ⋅ 
  ⋅         ⋅        0.880371   ⋅         ⋅ 
 0.129674   ⋅         ⋅         ⋅         ⋅ 
  ⋅         ⋅        0.585982   ⋅        0.990759
  ⋅         ⋅        0.66023    ⋅         ⋅ 

julia> implicit(X)
5×5 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
 0.655873  0.31668   ⋅        0.660662   ⋅ 
  ⋅         ⋅       0.938281   ⋅         ⋅ 
 0.360102   ⋅        ⋅         ⋅         ⋅ 
  ⋅         ⋅       0.765495   ⋅        0.995369
  ⋅         ⋅       0.812545   ⋅         ⋅ 

julia> Zygote.jacobian(implicit, X)[1]  # way too dense
┌ Warning: IterativeLinearSolver failed, result contains NaNs
└ @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:32
...
25×25 SparseMatrixCSC{Float64, Int64} with 569 stored entries:
⎡⣻⣾⣗⣿⣿⣗⣗⣒⣿⣿⣿⣗⡇⎤
⎢⣽⣽⣿⣿⣿⣯⣯⣭⣿⣿⣿⣯⡇⎥
⎢⢿⢿⡿⣿⣿⣿⡿⠿⣿⣿⣿⡿⡇⎥
⎢⢹⢹⡏⣿⣿⡏⡟⢍⣿⣿⣿⡏⡇⎥
⎢⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡇⎥
⎢⢿⢿⡿⣿⣿⡿⡿⠿⣿⣿⣿⣿⡇⎥
⎣⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁⎦

julia> ForwardDiff.jacobian(implicit, X)  # error
ERROR: MethodError: no method matching Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}(::UndefInitializer, ::Int64)

Closest candidates are:
  (::Type{Base.ReshapedArray{T, N, P, MI}} where {T, N, P<:AbstractArray, MI<:Tuple{Vararg{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})(::Any, ::Any, ::Any)
   @ Base reshapedarray.jl:6

Stacktrace:
  [1] Krylov.GmresSolver(m::Int64, n::Int64, memory::Int64, S::Type{Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}})
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1549
  [2] Krylov.GmresSolver(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}, memory::Int64)
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/krylov_solvers.jl:1567
  [3] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}; memory::Int64, M::LinearAlgebra.UniformScaling{Bool}, N::LinearAlgebra.UniformScaling{Bool}, ldiv::Bool, restart::Bool, reorthogonalization::Bool, atol::Float64, rtol::Float64, itmax::Int64, timemax::Float64, verbose::Int64, history::Bool, callback::Krylov.var"#164#171", iostream::Core.CoreSTDOUT)
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:121
  [4] gmres(A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
    @ Krylov ~/.julia/packages/Krylov/Iuavd/src/gmres.jl:119
  [5] solve(#unused#::IterativeLinearSolver, A::LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, b::Base.ReshapedArray{Float64, 1, SparseMatrixCSC{Float64, Int64}, Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}})
    @ ImplicitDifferentiation ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/src/linear_solver.jl:27
  [6] (::ImplicitDifferentiationForwardDiffExt.var"#2#5"{Float64, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#5#7"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, LinearOperators.LinearOperator{Float64, Int64, ImplicitDifferentiation.PushforwardMul!{AbstractDifferentiationForwardDiffExt.var"#pushforward#7"{ImplicitDifferentiation.var"#4#6"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, SparseMatrixCSC{Float64, Int64}, Tuple{}}, Tuple{SparseMatrixCSC{Float64, Int64}}}, 2}, Nothing, Nothing, Vector{Float64}}, SparseMatrixCSC{Float64, Int64}})(k::Int64)
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:39
  [7] macro expansion
    @ ./ntuple.jl:72 [inlined]
  [8] ntuple
    @ ./ntuple.jl:69 [inlined]
  [9] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:35
 [10] (::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing})(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64})
    @ ImplicitDifferentiationForwardDiffExt ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/ext/ImplicitDifferentiationForwardDiffExt.jl:25
 [11] chunk_mode_jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:183
 [12] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:23
 [13] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9, SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, Float64}, Float64, 9}, Int64}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
 [14] jacobian(f::ImplicitFunction{typeof(forward), typeof(conditions), IterativeLinearSolver, Nothing}, x::SparseMatrixCSC{Float64, Int64})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/jacobian.jl:19
 [15] top-level scope
    @ ~/Work/GitHub/Julia/ImplicitDifferentiation.jl/test/playground.jl:22

Correct results

julia> vecsqrt(x) = sqrt.(x);

julia> Zygote.jacobian(vecsqrt, x)[1]
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅   0.566692   ⋅ 
  ⋅    ⋅    ⋅    ⋅        0.705914

julia> ForwardDiff.jacobian(vecsqrt, x)
5×5 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅    ⋅         ⋅ 
  ⋅    ⋅    ⋅   0.566692   ⋅ 
  ⋅    ⋅    ⋅    ⋅        0.705914

julia> Zygote.jacobian(vecsqrt, X)[1]
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦

julia> ForwardDiff.jacobian(vecsqrt, X)
25×25 SparseMatrixCSC{Float64, Int64} with 8 stored entries:
⎡⠁⠄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠐⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠐⢄⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎦

gdalle avatar Aug 08 '23 07:08 gdalle

Exactly!

The workaround I used to get it to work in the sparse vector + ForwardDiff case was to extract the values and partials in a different way:


using SparseArrays
import ForwardDiff as ℱ

function separate_values_and_partials_from_sparsevec_dual(V::SparseVector{ℱ.Dual{Z,S,N}}; tol::AbstractFloat = eps()) where {Z,S,N}
    nrows = length(V)
    ncols = N  # number of partials per dual; avoids indexing into an empty nzval

    rows = Int[]
    cols = Int[]
    prtls = Float64[]

    # collect the significant partials of each stored entry in COO format
    for (i, v) in enumerate(V.nzind)
        for (k, w) in enumerate(V.nzval[i].partials)
            if abs(w) > tol
                push!(rows, v)
                push!(cols, k)
                push!(prtls, w)
            end
        end
    end

    vvals = sparsevec(V.nzind, [i.value for i in V.nzval], nrows)
    ps = sparse(rows, cols, prtls, nrows, ncols)

    return vvals, ps
end
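For reference, a hypothetical way to exercise it (hand-built duals again, assuming the definitions above):

x = sparsevec([4, 5], [0.78, 0.50], 5)
V = map(v -> ℱ.Dual(v, 2v), x)  # value v with a single partial 2v; zero maps to zero
vvals, ps = separate_values_and_partials_from_sparsevec_dual(V)
# vvals is a length-5 SparseVector of values, ps a 5×1 sparse matrix of partials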

thorek1 avatar Aug 08 '23 08:08 thorek1

I tried to play around to at least fix the broken results even if they're dense, but I ran into https://github.com/JuliaDiff/ForwardDiff.jl/issues/658

If you have any ideas...

gdalle avatar Aug 08 '23 10:08 gdalle

I don't think that's relevant in the ImplicitDiff case: when I tested it against FiniteDifferences it worked out fine. But I might be overlooking something here (my head is stuck on another problem right now).

thorek1 avatar Aug 08 '23 10:08 thorek1

It is relevant, because the conditions are zero at optimality, so the sparse arrays cancel each other out and we get wrong derivatives when constructing the operators A and B.

gdalle avatar Aug 08 '23 10:08 gdalle

It's the reason why we get zeros everywhere with ForwardDiff in the sparse vector example above
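To make the mechanism concrete, here is a sketch (not a verified reproduction of the ForwardDiff issue), with values chosen so the square roots are exact:

using SparseArrays

x = sparsevec([4, 5], [0.25, 2.25], 5)
y = sqrt.(x)
c = abs2.(y) .- x   # the conditions at the solution: identically zero in value
all(iszero, c)      # true; machinery that decides sparsity from these primal values
                    # throws away exactly the entries whose partials A and B need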

gdalle avatar Aug 08 '23 10:08 gdalle

OK, I see (now with a bit more brainpower :)). Then there might be a separate issue related to partials not being extracted properly in the ImplicitDiff code.

thorek1 avatar Aug 08 '23 11:08 thorek1

Note to self: sparse arrays throw byproduct-related pullback errors while other arrays do not. See https://github.com/JuliaDiff/ChainRules.jl/issues/731

julia> x = sparse(rand(Float32, 2, 3));

julia> forward(x) = sqrt.(x), 2;

julia> conditions(x, y, z) = abs.(y) .^ z .- x;

julia> implicit = ImplicitFunction(forward, conditions);

julia> Zygote.pullback(first ∘ implicit, x)
ERROR: MethodError: no method matching zero(::Tuple{Float32, Zygote.ZBack{ChainRules.var"#power_pullback#1338"{Float32, Int64, ProjectTo{Float64, @NamedTuple{}}, ProjectTo{Float32, @NamedTuple{}}, Float32}}})

gdalle avatar Aug 10 '23 12:08 gdalle

So I think ForwardDiff is never gonna work out of the box unless we copy @thorek1's rather tedious reshaping trick, but Zygote now works and returns sparse Jacobians as of #114

gdalle avatar Aug 10 '23 14:08 gdalle

Great that Zygote works. That would be the most efficient backend in the sparse part of my code anyway.

As for ForwardDiff: at least you get a chance to build a sparse pipeline with ForwardDiff and ImplicitDiff.

For lack of less involved alternatives at the moment, I am going for analytical implicit differentiation solutions in the sparse part, but I'm happy to switch if possible (and fast enough).

thorek1 avatar Aug 10 '23 14:08 thorek1

The approach in v0.6.0, following #135, is to:

  • support only vectors instead of higher-order arrays (since we need to reshape anyway)
  • let users build sparse matrices inside the function

This saves many headaches, and it follows a lengthy Discourse thread where we found out that Enzyme handles this just fine:

https://discourse.julialang.org/t/how-do-you-speed-up-the-linear-sparse-solver-in-zygote/111801/24?u=gdalle
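Concretely, the intended pattern looks something like this sketch, with a hypothetical diagonal solve standing in for a real problem:

using SparseArrays
using ImplicitDifferentiation

# The implicit function maps Vector -> Vector; all sparsity lives inside
# forward and conditions, rebuilt from the plain vector input.
function forward_sparse(x::AbstractVector)
    A = spdiagm(0 => x)          # build the sparse structure internally
    return A \ ones(length(x))   # sparse linear solve (assumes no zeros in x)
end

conditions_sparse(x, y) = spdiagm(0 => x) * y .- ones(length(x))

implicit_sparse = ImplicitFunction(forward_sparse, conditions_sparse)

AD backends then only ever see plain vectors at the boundary, while the sparse structure is exploited inside forward_sparse.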

gdalle avatar Apr 10 '24 13:04 gdalle