AlgebraicMultigrid.jl
Use sparse LU as a faster coarse solver than Pinv
The only available CoarseSolver is Pinv, which gets very slow when the coarsest-level A is large, for example in a two-grid setting instead of a multigrid V-cycle. I wrote a code demo that uses UMFPACK sparse LU to solve the coarse equation. It shows a significant speed-up in both the AMG setup phase and the solve phase. I can make a PR if useful.
Current performance with dense Pinv
import Random: seed!
using SparseArrays
using LinearAlgebra
using AlgebraicMultigrid
import AlgebraicMultigrid as AMG
using BenchmarkTools
# A Poisson matrix also makes the point, but the second-level A with Ruge-Stuben coarsening is a bit too dense,
# so the gap between sparse and dense solves is not significant enough.
# Here we use a very sparse random matrix that leads to a sparse-enough second level.
# A = poisson((64, 64))
seed!(0)
A = sprand(8000, 8000, 1e-4) * 0.1 + I # add to the diagonal to avoid a singular matrix
b = A * ones(size(A, 1))
ruge_stuben(A, max_levels=10) # fast if coarsest A is very small
@time ml = ruge_stuben(A, max_levels=2) # takes 5 seconds, mostly spends on `inv(Matrix(ml.final_A))`
@btime AMG._solve(ml, b, maxiter=10); # takes 16 ms
Use Sparse LU to improve performance
import SuiteSparse.UMFPACK: UmfpackLU
import AlgebraicMultigrid: CoarseSolver
# note: UMFPACK only supports Float64 or ComplexF64
# not sure how to best restrict the types
struct Splu <: CoarseSolver
    LU::UmfpackLU
    Splu(A) = new(lu(A))
end

Base.show(io::IO, p::Splu) = print(io, "Splu")

function (p::Splu)(x, b)
    x[:] = p.LU \ b
end
ml_sp = ruge_stuben(A, max_levels=2, coarse_solver=Splu)
@btime ruge_stuben(A, max_levels=2, coarse_solver=Splu) # takes 6 ms, ~1000x faster than before!
@btime AMG._solve(ml_sp, b, maxiter=10); # takes 9 ms, 2x faster than before
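Regarding the type-restriction comment above, one option is a parametric struct whose element type is bounded by what UMFPACK supports. A minimal sketch (the name SpluTyped and the exact parameter bounds are my own choice, not part of AMG.jl):

```julia
using SparseArrays
using LinearAlgebra
import SuiteSparse.UMFPACK: UmfpackLU
import AlgebraicMultigrid: CoarseSolver

# Restrict to Float64/ComplexF64, the element types UMFPACK supports
struct SpluTyped{Tv<:Union{Float64,ComplexF64},Ti<:Integer} <: CoarseSolver
    LU::UmfpackLU{Tv,Ti}
    SpluTyped(A::SparseMatrixCSC{Tv,Ti}) where {Tv<:Union{Float64,ComplexF64},Ti<:Integer} =
        new{Tv,Ti}(lu(A))
end

function (p::SpluTyped)(x, b)
    x[:] = p.LU \ b
end
```

Passing a matrix with an unsupported element type (e.g. Float32) then fails at construction time with a MethodError instead of deep inside UMFPACK.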
- Side note 1: PyAMG also supports splu as a coarse solver. Its logic is to run the LU factorization in the first iteration of the solve phase, and reuse the LU factors in later iterations. I think a more natural design is to compute the LU factors in the setup phase, so it is easier to separately benchmark factor vs solve.
- Side note 2: (In case someone is interested in doing so) The sparse LU apply stage breaks Zygote autodiff https://github.com/FluxML/Zygote.jl/issues/1168
- Side note 3: Might also be useful to call Pardiso.jl instead of UMFPACK.
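For comparison, the PyAMG-style lazy logic from side note 1 could be mimicked like this. LazySplu is a hypothetical name and this is only a sketch of the control flow, not a proposed addition:

```julia
using SparseArrays
using LinearAlgebra
import SuiteSparse.UMFPACK: UmfpackLU
import AlgebraicMultigrid: CoarseSolver

mutable struct LazySplu <: CoarseSolver
    A::SparseMatrixCSC{Float64,Int}
    LU::Union{Nothing,UmfpackLU}
    LazySplu(A) = new(A, nothing)  # setup phase only stores A, no factorization yet
end

function (p::LazySplu)(x, b)
    # factor on the first application, reuse the factors afterwards;
    # this moves the factorization cost into the first solve
    p.LU === nothing && (p.LU = lu(p.A))
    x[:] = p.LU \ b
end
```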
Instead, it should just be set up to use LinearSolve.jl so that downstream users can choose which linear solver to use in this step.
> it should just be set up to use LinearSolve.jl
Interesting, then I would argue that the presmoother and postsmoother should also be flexible, generic approximate/iterative solvers. For complicated (say indefinite, nonsymmetric) problems, it makes sense to use GMRES as a smoother (an option in PyAMG) or ILU as a smoother. Many are defined in IterativeSolvers.jl or other existing solver packages, but currently AMG.jl only allows GaussSeidel and Jacobi.
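As a rough illustration of that idea, a GMRES smoother could be a small callable wrapping IterativeSolvers.gmres!. This assumes smoothers are invoked as smoother(A, x, b) updating x in place (I have not checked AMG.jl's exact smoother interface, so treat this as a sketch):

```julia
using SparseArrays
using LinearAlgebra
import IterativeSolvers: gmres!

struct GMRESSmoother
    iters::Int  # number of inner GMRES iterations per smoothing sweep
end

function (s::GMRESSmoother)(A, x, b)
    # run a few GMRES iterations starting from the current x;
    # not tuned as a real smoother, just showing the shape of the wrapper
    gmres!(x, A, b; maxiter = s.iters, initially_zero = false)
end
```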
What should the new interface look like then? What would be the first step to implement it?
> What should the new interface look like then?
Something like ruge_stuben(A, max_levels=2, linsolve = KLUFactorization(), presmoother = Krylov_GMRES()), just passing methods for the solvers: http://linearsolve.sciml.ai/dev/solvers/solvers/.
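To make the LinearSolve.jl idea concrete for the coarse solve, one could wrap its caching interface so the factorization happens once in setup and the factors are reused on every call. LinearSolveWrapper is a hypothetical name and this is only a sketch, assuming the usual LinearProblem/init/solve! API:

```julia
using LinearSolve
using SparseArrays
import AlgebraicMultigrid: CoarseSolver

struct LinearSolveWrapper{C} <: CoarseSolver
    cache::C
end

# build the cache (and, for factorization algorithms, the factors) once in setup;
# any LinearSolve.jl algorithm can be passed as `alg`
LinearSolveWrapper(A::AbstractMatrix; alg = KLUFactorization()) =
    LinearSolveWrapper(init(LinearProblem(A, zeros(size(A, 1))), alg))

function (p::LinearSolveWrapper)(x, b)
    p.cache.b = b             # update the right-hand side, keep the factors
    x[:] = solve!(p.cache).u
end
```

The choice of KLUFactorization here is arbitrary; the point is that the algorithm becomes a user-facing parameter rather than being hard-coded.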
Happy to give folks commit access to the repo, if they have bandwidth to make these improvements.