BlockArrays.jl icon indicating copy to clipboard operation
BlockArrays.jl copied to clipboard

ldiv for PseudoBlockMatrix two times slower than for native Matrix

Open krcools opened this issue 2 years ago • 2 comments

PseudoBlockMatrix is a structured wrapper to a native array. This is why the following came as a surprise:

using BlockArrays

n = 800
M = rand(2n,2n)
v = rand(2n)

pbM = PseudoBlockMatrix(M,[n,n],[n,n])
pbv = PseudoBlockVector(v,[n,n])

using BlockArrays.ArrayLayouts

MemoryLayout(M) == MemoryLayout(pbM)
MemoryLayout(v) == MemoryLayout(pbv)

@time u = M \ v;
@time pbu = pbM \ pbv;

output:

0.053086 seconds (4 allocations: 19.556 MiB, 9.18% gc time)
0.126742 seconds (19 allocations: 39.992 MiB, 2.13% gc time)

For matrices of larger size it remains true that solving the system with BlockPseudoArrays is two times slower. Where could this originate from?

krcools avatar Oct 06 '21 16:10 krcools

Ah it's because ArrayLayouts.jl defaults to QR factorization, since I believe it's more likely to be stable for structured matrices E.g. for a BlockBandedMatrix doing block LU may introduce instability. Though this could be changed.

In any case this PR into ArrayLayouts.jl makes LU the default for strided arrays:

julia> @time u = M \ v;
  0.030652 seconds (4 allocations: 19.556 MiB)

julia> @time pbu = pbM \ pbv;
  0.028882 seconds (13 allocations: 19.581 MiB)

Let me know if that works for you.

dlfivefifty avatar Oct 07 '21 09:10 dlfivefifty

Thanks! That PR fixed it for me:

0.057716 seconds (4 allocations: 19.556 MiB)
0.056219 seconds (13 allocations: 19.581 MiB, 10.98% gc time)

krcools avatar Oct 07 '21 09:10 krcools