BandedMatrices.jl icon indicating copy to clipboard operation
BandedMatrices.jl copied to clipboard

Matrix-vector product with a transposed/adjoint `BandedMatrix` is slow

Open jishnub opened this issue 3 years ago • 0 comments

julia> B = BandedMatrix(1=>Float64.(1:7))
8×8 BandedMatrix{Float64} with bandwidths (-1, 1):
  ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅   2.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅   3.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅   4.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅   5.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   6.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   7.0
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 

julia> v = Float64[i for i in axes(B,2)];

julia> @btime transpose($B) * $v;  # uses gbmv
  989.600 ns (3 allocations: 304 bytes)

julia> @btime transpose(transpose(v) * $B);  # uses default_blasmul!
  254.820 ns (3 allocations: 160 bytes)

julia> @btime $B' * v;
  1.014 μs (2 allocations: 144 bytes)

julia> @btime ($v' * $B)';
  191.123 ns (1 allocation: 128 bytes)

Edit: This behavior seems to switch at larger sizes, so I'm not sure if there's a universal fix

julia> B = BandedMatrix(rand(100, 100));

julia> v = Float64[i for i in axes(B,2)];

julia> @btime $B' * v;
  1.755 μs (2 allocations: 912 bytes)

julia> @btime ($v' * $B)';
  4.973 μs (1 allocation: 896 bytes)

jishnub avatar Nov 16 '22 09:11 jishnub