BlockBandedMatrices.jl
New type of BlockTridiagonalKron (or something)
Combined code and basic linear algebra question for @dlfivefifty and any other experts: some of my sparse matrices have a very particular structure, which is common enough in economics to be worth specializing types for:
- Take a sequence of tridiagonal matrices L_1, ..., L_N, each of size MxM, and a square (possibly dense) matrix Q of size NxN which is an intensity matrix of a Markov chain (i.e. q_ii = -sum(q_ij for j != i)).
- Then you can construct a sparse matrix with code like the following:
L = blockdiag(L_1, L_2, ..., L_N) + kron(Q, sparse(I, M, M))
- The structure of this is: a block matrix with tridiagonal matrices down the diagonal, and a bunch of constant coefficient diagonal matrices on all off-diagonal blocks. Lots of structure.
While I can represent this in a BlockBanded matrix, it is inefficient in space (since it has to have the same bandwidth for each block). More importantly, there is structure there that might be exploited to speed up linear solves.
So, with that in mind, what do you think about both the feasibility and payoff of creating a specialized matrix type of that form that can exploit the structure for fast linear solves?
This is where I am entirely ignorant... But looking at https://github.com/JuliaMatrices/BlockBandedMatrices.jl/blob/master/src/blockskylineqr.jl#L19 it seems like the QR decomposition can be done in blocks in some way? The QR for every block other than the diagonal is trivial. Is this both something feasible without too much specialized knowledge, and likely to pay off big for the speed of linear solves?
If you think it worthwhile, we could create this type and add it to this package.
Why not do it the other way around: kron(sparse(I, M, M), Q)? Then you have dense blocks on the diagonal, with diagonal off-diagonal blocks. This would be fairly efficient as a BlockBandedMatrix, though you don't take advantage of sparsity of the off-diagonal blocks.
QR decomposition for BlockSkylineMatrix (of which BlockBandedMatrix is a special case) does indeed use LAPACK for the QR: it takes a view of all blocks on and below the diagonal (A[Block.(k:k+A.l), Block(k)]), which is memory-compatible with a strided matrix, and hence LAPACK can be used. I've been meaning to write a blog post on this (but am swamped with other things, unfortunately).
Sorry, that was just a simple way to generate the matrix as something sparse; there may be other ways. Your suggestion generates a different matrix, unless I made a mistake.
To be concrete here is some working code to create the matrices.
using LinearAlgebra, SparseArrays, BlockBandedMatrices
# in reality, these would be more complicated and huge
L_1 = Tridiagonal([1, 1, 1], [-2, -2, -2, -2], [1, 1, 1])
L_2 = Tridiagonal([1, 1, 1], [-2, -2, -2, -3], [1, 1, 1])
Ls = (L_1, L_2)
N = length(Ls)
M = size(L_1, 1)
# fully dense in this case, but maybe sparse later and probably banded
Q = [-0.1 0.1; 0.2 -0.2]
Q_bandwidths = (1,1)
@assert N == size(Q, 1) == size(Q, 2)
L = blockdiag(sparse.(Ls)...) + kron(Q, sparse(I, M, M))
Can we set this up as a BandedBlockBandedMatrix? Sure:
using BlockBandedMatrices
Lblock = BandedBlockBandedMatrix{Float64}(Matrix(L), ([M,M],[M,M]), (1,1), Q_bandwidths)
But looking first at storage (with Q dense for now)
- With the banded setup above: approximately 3M N^2 nonzeros
- With the sparse setup: approximately 3MN + M(N^2 - N)
- But if we exploit that the off-diagonal blocks are constant diagonals: 3MN + N^2
For a system of M=1000 and N=500, the differences are:
- 750,000,000 with the BandedBlockBandedMatrix and its shared bandwidths
- 251,000,000 with the sparse setup, since it doesn't force the larger bandwidth onto the off-diagonal blocks
- 1,750,000 with banded diagonal blocks and specialized storage exploiting that the off-diagonal blocks are constant diagonals
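As a quick sanity check on the arithmetic, the three counts follow directly from the formulas above (a sketch; the variable names are mine):

```julia
# Storage counts for M = 1000, N = 500, using the three formulas above
M, N = 1000, 500
banded_storage   = 3M * N^2            # each of the N^2 blocks stores ~3M entries
sparse_storage   = 3M*N + M*(N^2 - N)  # tridiagonal blocks + dense-Q off-diagonal blocks
constant_storage = 3M*N + N^2          # tridiagonal blocks + one coefficient per Q entry
```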
I know how to do the matrix-vector multiplications to specialize the type, but as for algorithms for linear solves, there I am ignorant.
But it seems to me that LU or QR algorithms could exploit that the off-diagonal blocks are just diagonal, or (even better) that they are constant diagonals?
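For reference, the matrix-vector product mentioned above can be done block-by-block without ever assembling the sparse matrix. A minimal sketch using only the standard library (specialized_mul is a hypothetical name):

```julia
using LinearAlgebra, SparseArrays

# Block k of L*x is  L_k * x_k + sum_j Q[k,j] * x_j,
# so the product only needs the tridiagonal blocks and Q itself.
function specialized_mul(Ls, Q, x)
    M, N = size(Ls[1], 1), length(Ls)
    y = similar(x)
    xs = [view(x, (k-1)*M+1:k*M) for k in 1:N]
    for k in 1:N
        yk = view(y, (k-1)*M+1:k*M)
        mul!(yk, Ls[k], xs[k])          # tridiagonal diagonal block
        for j in 1:N
            yk .+= Q[k, j] .* xs[j]     # kron(Q, I) contribution
        end
    end
    return y
end

# check against the assembled sparse matrix from earlier in the thread
L_1 = Tridiagonal([1.0, 1, 1], [-2.0, -2, -2, -2], [1.0, 1, 1])
L_2 = Tridiagonal([1.0, 1, 1], [-2.0, -2, -2, -3], [1.0, 1, 1])
Q = [-0.1 0.1; 0.2 -0.2]
L = blockdiag(sparse(L_1), sparse(L_2)) + kron(sparse(Q), sparse(I, 4, 4))
x = randn(8)
@assert specialized_mul((L_1, L_2), Q, x) ≈ L * x
```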
My suggestion is equivalent to permuting the rows/columns to interchange the notion of blocks and entries in the blocks. Essentially a "transpose" if you reinterpret a blocked vector as a matrix. So the linear system is equivalent and you can take advantage of BlockBandedMatrix.
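The reordering can be checked directly with a perfect-shuffle permutation. A sketch using only the standard library, on the small matrices from the earlier snippet:

```julia
using LinearAlgebra, SparseArrays

M, N = 4, 2
L_1 = Tridiagonal([1.0, 1, 1], [-2.0, -2, -2, -2], [1.0, 1, 1])
L_2 = Tridiagonal([1.0, 1, 1], [-2.0, -2, -2, -3], [1.0, 1, 1])
Q = [-0.1 0.1; 0.2 -0.2]
L = blockdiag(sparse(L_1), sparse(L_2)) + kron(sparse(Q), sparse(I, M, M))

# Perfect shuffle: entry (i, k) of an M×N "blocked vector" moves from
# position (k-1)*M + i to position (i-1)*N + k.
p = vec(permutedims(reshape(1:M*N, M, N)))

# The Kronecker factors swap order under this permutation ...
@assert kron(sparse(Q), sparse(I, M, M))[p, p] == kron(sparse(I, M, M), sparse(Q))

# ... and the permuted matrix has dense Q-plus-diagonal blocks on the diagonal.
Lp = L[p, p]
@assert Lp[1:N, 1:N] == Q + Diagonal([L_1[1, 1], L_2[1, 1]])
```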
I think I "kind of" get what you are saying here for the kron part, but I'm not sure how the rest of it would work (i.e. the blockdiag part, in the way I have constructed it):
L = blockdiag(sparse.(Ls)...) + kron(Q, sparse(I, M, M))
This may be above my pay grade.
But let's make a change to my setup: instead of Q being dense, let's say that it is also banded, with roughly the same bandwidth as the Ls. If so, does the transposing with the kron still help?
And, if not, and the key really is in the use of BlockBandedMatrix, are there any examples of how to construct a block-banded matrix with different matrix types for the diagonal vs. the off-diagonal blocks?
I take that back. I now fully get what you are saying about interchanging the dimensions. But I still don't understand the code to work with different matrix types in the block-diagonal vs. off-diagonals.
You probably want to use a BlockArray wrapping a BandedMatrix{Union{DTyp,OffDTyp}} and make sure it implements the block-banded interface. None of this has been done yet, and it will only help with mat * vec and mat * mat: linear solves will have fill-in.
Alternatively, your matrix will be banded with bandwidth proportional to the block size, so you could just wrap a BandedMatrix{Float64} in a PseudoBlockMatrix.
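For the banded-Q variant, the overall bandwidth being proportional to the block size is easy to confirm. A sketch using only the standard library (a small tridiagonal intensity matrix Q is assumed here):

```julia
using LinearAlgebra, SparseArrays

M, N = 4, 3
# tridiagonal diagonal blocks, and a banded (tridiagonal) intensity matrix Q
Ls = [Tridiagonal(ones(M-1), fill(-2.0, M), ones(M-1)) for _ in 1:N]
Q = Tridiagonal([0.2, 0.2], [-0.1, -0.4, -0.2], [0.1, 0.2])  # rows sum to zero
L = blockdiag(sparse.(Ls)...) + kron(sparse(Q), sparse(I, M, M))

# Every nonzero lies within M of the main diagonal, i.e. bandwidths (M, M),
# so L could be stored as a BandedMatrix (and wrapped in a PseudoBlockMatrix).
rows, cols, _ = findnz(L)
@assert maximum(abs.(rows .- cols)) == M
```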