ChainRulesCore.jl icon indicating copy to clipboard operation
ChainRulesCore.jl copied to clipboard

ProjectTo(Adjoint{<:SparseMatrixCSC})(::Matrix) does not preserve sparsity

Open oxinabox opened this issue 4 years ago • 2 comments

Consider:

julia> x = sprand(3,3, 0.2)
3×3 SparseMatrixCSC{Float64, Int64} with 4 stored entries:
 0.289391   ⋅        0.43614
  ⋅        0.427879   ⋅ 
  ⋅         ⋅        0.332239

julia> xt = x'
3×3 adjoint(::SparseMatrixCSC{Float64, Int64}) with eltype Float64:
 0.289391  0.0       0.0
 0.0       0.427879  0.0
 0.43614   0.0       0.332239

julia> ProjectTo(xt)(ones(3,3))
3×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

cc @jieli-matrix

oxinabox avatar Aug 23 '21 12:08 oxinabox

Thanks for your explanation for the root course~

jieli-matrix avatar Aug 23 '21 12:08 jieli-matrix

Here's what projection stores for x:

julia> using SparseArrays, ChainRulesCore

julia> x = sprand(3,3, 0.2)
3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
 0.356584   ⋅         ⋅ 
  ⋅         ⋅         ⋅ 
 0.249112  0.257821   ⋅ 

julia> ProjectTo(x)
ProjectTo{SparseMatrixCSC}(element = ProjectTo{Float64}(), axes = (Base.OneTo(3), Base.OneTo(3)), rowval = [1, 3, 3], nzranges = UnitRange{Int64}[1:2, 3:3, 4:3], colptr = [1, 3, 4, 4])

One path here is to return another ProjectTo{SparseMatrixCSC} with different properties. With the adjoint, functions like rowvals(x') don't make sense, though. You can do this:

julia> rowvals(sparse(x'))
3-element Vector{Int64}:
 1
 1
 2

and sparse(x') ultimately calls copy(x') which calls SparseArrays.ftranspose((x').parent, x -> adjoint(copy(x)), Float64). Presumably they have put some thought into those methods, maybe this is OK? Then it's one line:

julia> ChainRulesCore.ProjectTo(x::LinearAlgebra.AdjOrTrans{<:Any,<:SparseMatrixCSC}) = ProjectTo(copy(x))

julia> ProjectTo(x')
ProjectTo{SparseMatrixCSC}(element = ProjectTo{Float64}(), axes = (Base.OneTo(3), Base.OneTo(3)), rowval = [1, 1, 2], nzranges = UnitRange{Int64}[1:1, 2:1, 2:3], colptr = [1, 2, 2, 4])

julia> ans(ones(3,3))
3×3 SparseMatrixCSC{Float64, Int64} with 3 stored entries:
1.0   ⋅   1.0
 ⋅    ⋅   1.0
 ⋅    ⋅    ⋅ 

The other possible path is to construct a different projector, which tries to return another Adjoint{SparseMatrixCSC} instead. This is what already happens for adjoint vectors, and perhaps it's slightly awkward that we've used up the name ProjectTo{Adjoint} for this purpose:

julia> y = sprand(10, 0.2)'
1×10 adjoint(::SparseVector{Float64, Int64}) with eltype Float64:
 0.0  0.0  0.0  0.0  0.0  0.0  0.554374  0.861901  0.0  0.0

julia> ProjectTo(y)
ProjectTo{Adjoint}(parent = ProjectTo{SparseVector}(element = ProjectTo{Float64}(), nzind = [7, 8], axes = (Base.OneTo(10),)),)

julia> ans(ones(1,10))
1×10 adjoint(::SparseVector{Float64, Int64}) with eltype Float64:
 0.0  0.0  0.0  0.0  0.0  0.0  1.0  1.0  0.0  0.0

Perhaps this could be more efficient than the first way. However, all of these sparse projections are pretty inefficient right now. They are more or less proof-of-concept implementations, awaiting someone with interest (and a use case!) to optimise them.

mcabbott avatar Oct 16 '21 02:10 mcabbott