ChainRulesCore.jl
ProjectTo(Adjoint{<:SparseMatrixCSC})(::Matrix) does not preserve sparsity
Consider:
julia> x = sprand(3,3, 0.2)
3×3 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.289391   ⋅        0.43614
  ⋅        0.427879   ⋅
  ⋅         ⋅        0.332239
julia> xt = x'
3×3 adjoint(::SparseMatrixCSC{Float64, Int64}) with eltype Float64:
 0.289391  0.0       0.0
 0.0       0.427879  0.0
 0.43614   0.0       0.332239
julia> ProjectTo(xt)(ones(3,3))
3×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
cc @jieli-matrix
Thanks for your explanation of the root cause~
Here's what the projector stores for x:
julia> using SparseArrays, ChainRulesCore
julia> x = sprand(3,3, 0.2)
3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
 0.356584   ⋅         ⋅
  ⋅         ⋅         ⋅
 0.249112  0.257821   ⋅
julia> ProjectTo(x)
ProjectTo{SparseMatrixCSC}(element = ProjectTo{Float64}(), axes = (Base.OneTo(3), Base.OneTo(3)), rowval = [1, 3, 3], nzranges = UnitRange{Int64}[1:2, 3:3, 4:3], colptr = [1, 3, 4, 4])
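For intuition, that stored pattern is all a projector needs to map a dense gradient back onto x's sparsity structure. A minimal sketch of the idea, using only SparseArrays (project_onto_pattern is a hypothetical helper, not ChainRulesCore's actual implementation):

using SparseArrays

# Hypothetical illustration: keep only the entries of dx that fall on
# x's sparsity pattern, discarding everything else.
function project_onto_pattern(dx::AbstractMatrix, x::SparseMatrixCSC)
    I, J, _ = findnz(x)                     # positions of x's stored entries
    V = [dx[i, j] for (i, j) in zip(I, J)]  # read dx only at those positions
    return sparse(I, J, V, size(x)...)
end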
One path here is to return another ProjectTo{SparseMatrixCSC} with different properties. The complication is that functions like rowvals are not defined for the adjoint wrapper, so rowvals(x') doesn't work directly. You can, however, do this:
julia> rowvals(sparse(x'))
3-element Vector{Int64}:
1
1
2
Here sparse(x') ultimately calls copy(x'), which calls SparseArrays.ftranspose((x').parent, x -> adjoint(copy(x)), Float64). Presumably some thought went into those methods, so maybe this is OK? Then the fix is one line:
julia> ChainRulesCore.ProjectTo(x::LinearAlgebra.AdjOrTrans{<:Any,<:SparseMatrixCSC}) = ProjectTo(copy(x))
julia> ProjectTo(x')
ProjectTo{SparseMatrixCSC}(element = ProjectTo{Float64}(), axes = (Base.OneTo(3), Base.OneTo(3)), rowval = [1, 1, 2], nzranges = UnitRange{Int64}[1:1, 2:1, 2:3], colptr = [1, 2, 2, 4])
julia> ans(ones(3,3))
3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
 1.0   ⋅   1.0
  ⋅    ⋅   1.0
  ⋅    ⋅    ⋅
The other possible path is to construct a different projector, one which returns another Adjoint{SparseMatrixCSC} instead. This is what already happens for adjoint vectors, and it's perhaps slightly awkward that we've used up the name ProjectTo{Adjoint} for that purpose:
julia> y = sprand(10, 0.2)'
1×10 adjoint(::SparseVector{Float64, Int64}) with eltype Float64:
0.0 0.0 0.0 0.0 0.0 0.0 0.554374 0.861901 0.0 0.0
julia> ProjectTo(y)
ProjectTo{Adjoint}(parent = ProjectTo{SparseVector}(element = ProjectTo{Float64}(), nzind = [7, 8], axes = (Base.OneTo(10),)),)
julia> ans(ones(1,10))
1×10 adjoint(::SparseVector{Float64, Int64}) with eltype Float64:
0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0
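A sketch of what this second path might look like for matrices, mirroring the adjoint-vector case above. Neither method below exists in ChainRulesCore; in particular the existing ProjectTo{Adjoint} callable assumes a row-vector shape, so a matrix method would be needed as well:

using ChainRulesCore, LinearAlgebra, SparseArrays

# Hypothetical sketch: wrap the parent's projector, as ProjectTo already
# does for adjoint vectors.
function ChainRulesCore.ProjectTo(x::Adjoint{<:Any,<:SparseMatrixCSC})
    return ProjectTo{Adjoint}(; parent = ProjectTo(parent(x)))
end

# Hypothetical callable: project the un-adjointed gradient onto the sparse
# parent, then re-wrap the result in Adjoint.
function (project::ProjectTo{Adjoint})(dx::AbstractMatrix)
    return adjoint(project.parent(adjoint(dx)))
end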
Perhaps this could be more efficient than the first way. However, all of these sparse projections are pretty inefficient right now. They are more or less proof-of-concept implementations, awaiting someone with interest (and a use case!) to optimise them.
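For anyone picking that up, here is a rough sketch of the kind of optimisation available, assuming the projector keeps the original colptr and rowval as the printout above suggests (project_dense is a hypothetical helper, not ChainRulesCore API):

using SparseArrays

# Hypothetical sketch: walk the stored pattern once, reading dx only at the
# stored positions, instead of re-deriving the pattern on every call.
function project_dense(colptr::Vector{Int}, rowval::Vector{Int},
                       dx::AbstractMatrix{T}) where {T<:Number}
    nzval = Vector{T}(undef, length(rowval))
    for j in 1:length(colptr)-1            # columns of the original pattern
        for k in colptr[j]:(colptr[j+1]-1)
            nzval[k] = dx[rowval[k], j]    # gradient entry at a stored position
        end
    end
    return SparseMatrixCSC(size(dx, 1), size(dx, 2), copy(colptr), copy(rowval), nzval)
end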