SparseArrays.jl
Regression for `mul!` from 1.9 to 1.10
This is about the in-place multiplication `mul!(b, A, x)` of a sparse matrix `A` by a vector `x`.
From 1.9.3 to 1.10.0-rc1, this operation
- has gotten significantly slower
- has started allocating when `x` is a view
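For context, a view of a column is a `SubArray`, not a `Vector`, so it can dispatch to a different `mul!` method than a plain vector does. A minimal illustration of the type difference (types only, assuming a stock Julia session):

```julia
x_vec = ones(3)
x_view = view(ones(3, 1), :, 1)

# A plain vector and a column view have different types,
# so they can hit different mul! methods:
typeof(x_vec)   # Vector{Float64}
typeof(x_view)  # SubArray{Float64, 1, Matrix{Float64}, ...}
```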
MWE
```julia
using BenchmarkTools, LinearAlgebra, SparseArrays

function testmul(n)
    A = sparse(Float64, I, n, n)
    b = Vector{Float64}(undef, n)
    @btime mul!($b, $A, x) setup=(x = ones($n))
    @btime mul!($b, $A, x) setup=(x = view(ones($n, 1), :, 1))
    return nothing
end

testmul(1000)
```
Results
| | Julia 1.9 | Julia 1.10 |
|---|---|---|
| `x` vector | 2.135 μs (0 allocations) | 3.298 μs (0 allocations) |
| `x` view | 2.502 μs (0 allocations) | 4.087 μs (1 allocation) |
Could you please investigate whether this is taking a different path than it used to take? It could be that we missed some dispatch...
First investigations on Discourse: https://discourse.julialang.org/t/why-does-mul-u-a-v-allocate-when-a-is-sparse-and-u-v-are-views/105995/10?u=gdalle
Investigating the vector case first, and indeed the chain of functions has changed a lot.
In Julia 1.9, the call stack goes like this:

```julia
mul!(C, A, B)
mul!(C::StridedVecOrMat, A::AbstractSparseMatrixCSC, B::DenseInputVecOrMat, α::Number, β::Number)
```
In Julia 1.10, the call stack goes like this:

```julia
mul!(C, A, B)
mul!(y::AbstractVector, A::AbstractVecOrMat, x::AbstractVector, alpha::Number, beta::Number)
generic_matvecmul!(C::StridedVecOrMat, tA, A::SparseMatrixCSCUnion, B::DenseInputVector, _add::MulAddMul)
spdensemul!(C, tA, tB, A, B, _add)
_spmatmul!(C, A, B, α, β)
```
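One way to confirm which method a given Julia version actually selects is `@which` from InteractiveUtils; a sketch mirroring the MWE's setup (the printed signature will differ between 1.9 and 1.10):

```julia
using InteractiveUtils, LinearAlgebra, SparseArrays

n = 1000
A = sparse(1.0I, n, n)  # Float64 sparse identity
b = Vector{Float64}(undef, n)
x = ones(n)

# Prints the mul! method selected for these argument types;
# comparing the output across versions exposes the changed dispatch path.
@which mul!(b, A, x)
```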
I have absolutely no idea what's going on. When I compare, on current master I get
```julia
julia> using BenchmarkTools, LinearAlgebra, SparseArrays

julia> n = 1000;

julia> A = sparse(Float64, I, n, n);

julia> b = Vector{Float64}(undef, n);

julia> x = ones(n);

julia> @btime SparseArrays._spmatmul!($b, $A, $x, true, false);
  3.317 μs (0 allocations: 0 bytes)

julia> @btime mul!($b, $A, $x);
  3.311 μs (0 allocations: 0 bytes)
  1.568 μs (0 allocations: 0 bytes) # on v1.9
```
where `_spmatmul!` contains only the multiplication kernel, without the wrapping and dispatch overhead. So it doesn't seem to be related to the changes in method dispatch, AFAICT.
That's a bad regression, right? Maybe an issue on the Julia repo is in order if SparseArrays is not to blame?
Yes. It's also not clear to me why the rewrap works without allocation in the plain-vector case but allocates in the view case. Both go through the same line of code.
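To isolate the allocation without BenchmarkTools, one can compare `@allocated` on a warmed-up call for the two input types; a rough check (not a substitute for `@btime`), consistent with the measurements reported above:

```julia
using LinearAlgebra, SparseArrays

n = 1000
A = sparse(1.0I, n, n)  # Float64 sparse identity
b = Vector{Float64}(undef, n)
x_vec = ones(n)
x_view = view(ones(n, 1), :, 1)

# Warm up both paths so compilation doesn't count as allocation
mul!(b, A, x_vec); mul!(b, A, x_view)

@allocated mul!(b, A, x_vec)   # 0 on both 1.9 and 1.10
@allocated mul!(b, A, x_view)  # 0 on 1.9, nonzero on 1.10.0-rc1
```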
The allocation is fixed on master now:

```julia
julia> testmul(1000)
  3.072 μs (0 allocations: 0 bytes)
  3.441 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.11.0-DEV.984"
```
I suspect this is the fix.
Is this resolved now?
The regression is still present. Here's what I see:
```julia
julia> testmul(1000)
  2.148 μs (0 allocations: 0 bytes)
  2.178 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.9.4"
```

vs

```julia
julia> testmul(1000)
  2.704 μs (0 allocations: 0 bytes)
  3.025 μs (0 allocations: 0 bytes)

julia> VERSION
v"1.12.0-DEV.301"
```
@vtjnash @oscardssmith Any ideas what is going on here?
As https://github.com/JuliaLang/julia/issues/52137 shows, this is unrelated to this package and possibly something upstream.
If this were only in SparseArrays I would blame https://github.com/JuliaLang/julia/issues/54464, but https://github.com/JuliaLang/julia/issues/52137 (as already said) seems to rule out a SparseArrays-specific issue.
Ref https://github.com/JuliaLang/julia/issues/52137#issuecomment-2139411363 for a bisect.