SparseArrays.jl
Regression for `mul!` from 1.9 to 1.10
This is about the in-place multiplication `mul!(b, A, x)` of a sparse matrix `A` by a vector `x`.
From 1.9.3 to 1.10.0-rc1, this operation
- has gotten significantly slower
- has started allocating when `x` is a view
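For context, a view of a column is a `SubArray`, not a `Vector`, so it can dispatch to a different `mul!` method than a plain vector does. A minimal illustration of the type difference (types only, assuming a stock Julia session):

```julia
x_vec = ones(3)
x_view = view(ones(3, 1), :, 1)

# A plain vector and a column view have different types,
# so they can hit different mul! methods:
typeof(x_vec)   # Vector{Float64}
typeof(x_view)  # SubArray{Float64, 1, Matrix{Float64}, ...}
```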
MWE
```julia
using BenchmarkTools, LinearAlgebra, SparseArrays

function testmul(n)
    A = sparse(Float64, I, n, n)
    b = Vector{Float64}(undef, n)
    @btime mul!($b, $A, x) setup=(x = ones($n))
    @btime mul!($b, $A, x) setup=(x = view(ones($n, 1), :, 1))
    return nothing
end

testmul(1000)
```
Results
| | Julia 1.9 | Julia 1.10 |
|---|---|---|
| `x` vector | 2.135 μs (0 allocations) | 3.298 μs (0 allocations) |
| `x` view | 2.502 μs (0 allocations) | 4.087 μs (1 allocation) |
Could you please investigate whether this is taking a different path than it used to take? It could be that we missed some dispatch...
First investigations on Discourse: https://discourse.julialang.org/t/why-does-mul-u-a-v-allocate-when-a-is-sparse-and-u-v-are-views/105995/10?u=gdalle
Investigating the vector case first, and indeed the chain of functions has changed a lot.
In Julia 1.9, the call stack goes like this:

```julia
mul!(C, A, B)
mul!(C::StridedVecOrMat, A::AbstractSparseMatrixCSC, B::DenseInputVecOrMat, α::Number, β::Number)
```
In Julia 1.10, the call stack goes like this:

```julia
mul!(C, A, B)
mul!(y::AbstractVector, A::AbstractVecOrMat, x::AbstractVector, alpha::Number, beta::Number)
generic_matvecmul!(C::StridedVecOrMat, tA, A::SparseMatrixCSCUnion, B::DenseInputVector, _add::MulAddMul)
spdensemul!(C, tA, tB, A, B, _add)
_spmatmul!(C, A, B, α, β)
```
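One way to confirm which method a given Julia version actually selects is `@which` from InteractiveUtils; a sketch mirroring the MWE's setup (the printed signature will differ between 1.9 and 1.10):

```julia
using InteractiveUtils, LinearAlgebra, SparseArrays

n = 1000
A = sparse(1.0I, n, n)  # Float64 sparse identity
b = Vector{Float64}(undef, n)
x = ones(n)

# Prints the mul! method selected for these argument types;
# comparing the output across versions exposes the changed dispatch path.
@which mul!(b, A, x)
```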
I have absolutely no idea what's going on. When I compare, on current master I get
```julia
julia> using BenchmarkTools, LinearAlgebra, SparseArrays

julia> n = 1000;

julia> A = sparse(Float64, I, n, n);

julia> b = Vector{Float64}(undef, n);

julia> x = ones(n);

julia> @btime SparseArrays._spmatmul!($b, $A, $x, true, false);
  3.317 μs (0 allocations: 0 bytes)

julia> @btime mul!($b, $A, $x);
  3.311 μs (0 allocations: 0 bytes)
  1.568 μs (0 allocations: 0 bytes) # on v1.9
```
where `_spmatmul!` contains only the multiplication kernel, without the wrapping and dispatch overhead. So it doesn't seem to be related to the changes in method dispatch, AFAICT.
That's a bad regression, right? Maybe an issue on the Julia repo is in order if SparseArrays is not to blame?
Yes. It's also not clear to me why the rewrap works without allocation in the plain-vector case but allocates in the view case. Both go through the same line of code.
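To isolate the allocation without BenchmarkTools, one can compare `@allocated` on a warmed-up call for the two input types; a rough check (not a substitute for `@btime`), consistent with the measurements reported above:

```julia
using LinearAlgebra, SparseArrays

n = 1000
A = sparse(1.0I, n, n)  # Float64 sparse identity
b = Vector{Float64}(undef, n)
x_vec = ones(n)
x_view = view(ones(n, 1), :, 1)

# Warm up both paths so compilation doesn't count as allocation
mul!(b, A, x_vec); mul!(b, A, x_view)

@allocated mul!(b, A, x_vec)   # 0 on both 1.9 and 1.10
@allocated mul!(b, A, x_view)  # 0 on 1.9, nonzero on 1.10.0-rc1
```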
The allocation is fixed on master now:

```julia
julia> testmul(1000)
  3.072 μs (0 allocations: 0 bytes)
  3.441 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.11.0-DEV.984"
```
I suspect this is the fix.
Is this resolved now?
The regression is still present. Here's what I see:
```julia
julia> testmul(1000)
  2.148 μs (0 allocations: 0 bytes)
  2.178 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.9.4"
```

vs

```julia
julia> testmul(1000)
  2.704 μs (0 allocations: 0 bytes)
  3.025 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.12.0-DEV.301"
```
@vtjnash @oscardssmith Any ideas what is going on here?
As https://github.com/JuliaLang/julia/issues/52137 shows, this is unrelated to this package and possibly something upstream.
If this were only in SparseArrays I would blame https://github.com/JuliaLang/julia/issues/54464, but https://github.com/JuliaLang/julia/issues/52137 (as already said) seems to rule out a SparseArrays-specific issue.
Ref https://github.com/JuliaLang/julia/issues/52137#issuecomment-2139411363 for a bisect.