SparseArrays.jl
SparseArrays.jl copied to clipboard
broadcast(/, ::SparseMatrixCSC, ::AbstractArray) stores unnecessary zeros
If A is a sparse matrix, B is a sparse or dense vector of non-zeros with appropriate size, then
C = A ./ B and C = A ./ B' return sparse matrices, which have nnz(C) == prod(size(C)); if A has a good number of structural zeros (few stored values), C has stored many zeros and the size of fields C.rowvaland C.nzval is extremely oversized.
That is not the case for A .* B, which has the expected sparsity structure like A.
The attempt A .* ( 1.0 ./ B) fails, though.
A work-around has to look like B2 = 1.0 ./ B; A .* B2.
Am I correct this is specifically about division? Classic example:
julia> spzeros(5,5) ./ ones(5)
5×5 SparseArrays.SparseMatrixCSC{Float64,Int64} with 25 stored entries:
[1, 1] = 0.0
[2, 1] = 0.0
[3, 1] = 0.0
⋮
It is not only division, but all binary operators f with f(zero(), zero()) != zero(). In the division case that is NaN != zero(). It is a consequence of the implementation of _broadcast_nonzeropres! in SparseArrays/src/higherorderfns.jl.
Right. The general case is all n-ary functions f with f(zeros(n)...) != 0 and with f(args...) == 0 when some nonzero (array) arguments exist in args. Division is the most common operator with the property that you only care about nonzeros (and not a particular nonzero value) in determining the output structure.
I know Sacha has talked about further possible optimizations to the sparse broadcasting code — particularly with respect to combinations of sparse and dense and other abstract array types. There's definitely room for improvement.
Tremendous room for improvement in this case, yes :). If memory serves, discussion of this particular shortcoming exists in at least one prior thread, but I cannot find said thread at the moment. Best!
Would trait-ish things work here?
Something like
struct GenerallyDoesNotTouchZeros end
struct NormallyTouchesZeros end
sparse_lhs_style(op, rhs) = NormallyTouchesZeros()
sparse_lhs_style(broadcasted(*), rhs) = GenerallyDoesNotTouchZeros()
sparse_lhs_style(broadcasted(/), rhs) = GenerallyDoesNotTouchZeros()
#...
touches_zeros_for_rhs_element(op, rhs) = true
touches_zeros_for_rhs_element(broadcasted(*), rhs) = !isfinite(rhs)
touches_zeros_for_rhs_element(broadcasted(/), rhs) = isnan(rhs) || rhs==0
This issue can be substantially improved without changing the overall approach to sparse broadcast. Regarding changing said overall approach, I hope you didn't have any aspirations of getting anything done today: https://github.com/JuliaLang/julia/pull/18590 😉. Best!
Just checking in to see if anyone had made progress fixing this bug. I found my way here from: https://discourse.julialang.org/t/sparse-broadcasting-division-with-a-dense-vector-yields-a-dense-array/15005
I also ran into this issue today and am currently using the workaround that @KlausC suggested