SparseArrays.jl icon indicating copy to clipboard operation
SparseArrays.jl copied to clipboard

broadcast(/, ::SparseMatrixCSC, ::AbstractArray) stores unnecessary zeros

Open KlausC opened this issue 7 years ago • 10 comments

If A is a sparse matrix, B is a sparse or dense vector of non-zeros with appropriate size, then C = A ./ B and C = A ./ B' return sparse matrices, which have nnz(C) == prod(size(C)); if A has a good number of structural zeros (few stored values), C has stored many zeros and the size of fields C.rowvaland C.nzval is extremely oversized.

That is not the case for A .* B, which has the expected sparsity structure like A. The attempt A .* ( 1.0 ./ B) fails, though.

A work-around has to look like B2 = 1.0 ./ B; A .* B2.

KlausC avatar Mar 21 '18 17:03 KlausC

Am I correct this is specifically about division? Classic example:

julia> spzeros(5,5) ./ ones(5)
5×5 SparseArrays.SparseMatrixCSC{Float64,Int64} with 25 stored entries:
  [1, 1]  =  0.0
  [2, 1]  =  0.0
  [3, 1]  =  0.0
 ⋮

mbauman avatar Mar 21 '18 18:03 mbauman

It is not only division, but all binary operators f with f(zero(), zero()) != zero(). In the division case that is NaN != zero(). It is a consequence of the implementation of _broadcast_nonzeropres! in SparseArrays/src/higherorderfns.jl.

KlausC avatar Mar 21 '18 19:03 KlausC

Right. The general case is all n-ary functions f with f(zeros(n)...) != 0 and with f(args...) == 0 when some nonzero (array) arguments exist in args. Division is the most common operator with the property that you only care about nonzeros (and not a particular nonzero value) in determining the output structure.

I know Sacha has talked about further possible optimizations to the sparse broadcasting code — particularly with respect to combinations of sparse and dense and other abstract array types. There's definitely room for improvement.

mbauman avatar Mar 21 '18 20:03 mbauman

Tremendous room for improvement in this case, yes :). If memory serves, discussion of this particular shortcoming exists in at least one prior thread, but I cannot find said thread at the moment. Best!

Sacha0 avatar Mar 22 '18 17:03 Sacha0

I cannot find said thread at the moment

Perhaps here?

garrison avatar Mar 30 '18 22:03 garrison

Perhaps here?

Not quite the one I was looking for, somewhat different issue :).

Sacha0 avatar Sep 30 '18 17:09 Sacha0

Would trait-ish things work here?

Something like

struct GenerallyDoesNotTouchZeros end
struct NormallyTouchesZeros end

sparse_lhs_style(op, rhs) = NormallyTouchesZeros()
sparse_lhs_style(broadcasted(*), rhs) = GenerallyDoesNotTouchZeros()
sparse_lhs_style(broadcasted(/), rhs) = GenerallyDoesNotTouchZeros()
#...
touches_zeros_for_rhs_element(op, rhs) = true
touches_zeros_for_rhs_element(broadcasted(*), rhs) = !isfinite(rhs)
touches_zeros_for_rhs_element(broadcasted(/), rhs) = isnan(rhs) || rhs==0

oxinabox avatar Oct 05 '18 13:10 oxinabox

This issue can be substantially improved without changing the overall approach to sparse broadcast. Regarding changing said overall approach, I hope you didn't have any aspirations of getting anything done today: https://github.com/JuliaLang/julia/pull/18590 😉. Best!

Sacha0 avatar Oct 05 '18 17:10 Sacha0

Just checking in to see if anyone had made progress fixing this bug. I found my way here from: https://discourse.julialang.org/t/sparse-broadcasting-division-with-a-dense-vector-yields-a-dense-array/15005

btmit avatar Jul 08 '21 00:07 btmit

I also ran into this issue today and am currently using the workaround that @KlausC suggested

evanmunro avatar Jun 08 '23 23:06 evanmunro