BDDC Example
Nothing much to look at here yet. I'm just putting this here to make it easier to see/comment on what I'm doing.
There is some debugging to do after #744 merges, but this should be close. The open task is prolonging into the broken space - we don't really have a way to go from one QFunction output into a pair of target vectors.
I have not started debugging yet, but all of the pieces I want are there. Here's to hoping
Huzzah - it compiles. Now to see what else is broken about it.
And a bug in FDM inverse is blocking further testing. Investigating in separate branch.
it does not converge, but it all runs now
new errors are good errors
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
KSP:
KSP Type : cg
KSP Convergence : DIVERGED_DTOL
Total KSP Iterations : 682
Final rnorm : 4.136087e+05
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 2.211230e+01
CG Solve Time : 0.0913367 (0.0913367) sec
Now I'm diverging. A new error state, as previously I had 0 KSP iterations.
I think there are more issues with the FDM.
ooh, now I'm diverging with an indefinite PC, but closer to the solution
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
KSP:
KSP Type : cg
KSP Convergence : DIVERGED_INDEFINITE_PC
Total KSP Iterations : 26
Final rnorm : 1.858340e-05
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.219564e-02
CG Solve Time : 0.00492215 (0.00492215) sec
And the final error is within the cutoff we use for the test suite, so maybe it is working?
PC should be SPD. Are you doing the lumped variant or the "Dirichlet" version (harmonic extension in the balancing)? You can use PCComputeExplicitOperator to see if it's symmetric and positive definite.
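The check suggested above can be sketched with a toy stand-in for the PC apply (a NumPy illustration of the idea only, not the example's actual PETSc code): assemble the preconditioner's action column by column, then test symmetry and the sign of the eigenvalues.

```python
import numpy as np

def explicit_operator(apply_pc, n):
    """Assemble the preconditioner's action column by column,
    the dense analogue of computing the explicit operator."""
    cols = [apply_pc(np.eye(n)[:, j]) for j in range(n)]
    return np.column_stack(cols)

# Toy "preconditioner": apply the inverse of a small SPD matrix.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
apply_pc = lambda x: np.linalg.solve(A, x)

M = explicit_operator(apply_pc, 3)
sym_err = np.max(np.abs(M - M.T))           # should be ~0 for a symmetric PC
eigs = np.linalg.eigvalsh(0.5 * (M + M.T))  # all positive for an SPD PC
print(sym_err < 1e-12, np.all(eigs > 0))
```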
I'm starting with the lumped variant and then I'll add the Dirichlet version once the lumped works
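For reference, the difference between the two variants can be shown on a toy partitioned matrix (a NumPy sketch under assumed interior/interface blocks, not the libCEED operators): the lumped variant injects interface values with zero interior part, while the Dirichlet variant fills the interior with the discrete harmonic extension u_I = -A_II^{-1} A_IG u_G.

```python
import numpy as np

# Toy symmetric matrix with nodes ordered [interior | interface].
A = np.array([[ 2.0, -1.0,  0.0, -1.0],
              [-1.0,  2.0, -1.0,  0.0],
              [ 0.0, -1.0,  2.0,  0.0],
              [-1.0,  0.0,  0.0,  2.0]])
ni = 3  # number of interior nodes
A_II, A_IG = A[:ni, :ni], A[:ni, ni:]

u_G = np.array([1.0])  # interface values to prolong

# Lumped: interface values only, interior left at zero.
u_lumped = np.concatenate([np.zeros(ni), u_G])

# Dirichlet: interior filled with the discrete harmonic extension.
u_I = -np.linalg.solve(A_II, A_IG @ u_G)
u_harmonic = np.concatenate([u_I, u_G])

# The harmonic extension zeroes the interior residual; lumped does not.
r_I = A_II @ u_harmonic[:ni] + A_IG @ u_harmonic[ni:]
print(np.max(np.abs(r_I)))  # ~0
```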
You could make the elements macro-elements containing Q1 sub-elements, then compare with our BDDC LFA paper and the corresponding results in PETSc. I don't know if that would take longer than "just find the bug". There's a high-order FE BDDC example in PETSc that could also be used, at one element per process.
"Just find the bug" was faster in this case.
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
BDDC:
Injection : scaled
Global Interface Nodes : 8
Owned Interface Nodes : 8
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 49
Final rnorm : 9.363374e-10
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.185030e-02
CG Solve Time : 0.00968734 (0.00968734) sec
Since this converges and is leak free, I'm tagging as "In Review".
#749 and #750 are prerequisites
Well, I added the harmonic extension but something is screwy
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 729
Owned Nodes : 729
DoF per node : 1
BDDC:
Injection : harmonic
Global Interface Nodes : 64
Owned Interface Nodes : 64
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 207
Final rnorm : 1.280022e-09
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 3.626101e-02
CG Solve Time : 0.21704 (0.21704) sec
And bp1 stopped converging somewhere in there.
Something is screwy, I found a way to make it even worse
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
BDDC:
Injection : scaled
Global Interface Nodes : 8
Owned Interface Nodes : 8
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 61
Final rnorm : 7.065059e-10
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.185030e-02
CG Solve Time : 0.0171467 (0.0171467) sec
With the latest fixes in #749
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
BDDC:
Injection : scaled
Global Interface Nodes : 8
Owned Interface Nodes : 8
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 17
Final rnorm : 2.510341e-10
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.185030e-02
CG Solve Time : 0.0300374 (0.0300374) sec
Naturally, this still isn't quite right, but it's less bad than it was
Latest FDM change made things worse again
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/avx/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
BDDC:
Injection : scaled
Global Interface Nodes : 8
Owned Interface Nodes : 8
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 26
Final rnorm : 6.412850e-10
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.185030e-02
CG Solve Time : 0.0085119 (0.0085119) sec
The number of iterations grows less rapidly when I increase the number of cells though, so that's a win
@jeremylt @jedbrown I did not know you got a working BDDC code. I was wondering if the Arrinv solve can be hooked up in PCBDDC as a subdomain solver.
Are you wanting to use PCBDDC with one subdomain per process and put libCEED's Arrinv as subdomain solver? Or would you make PCBDDC work with many subdomains with hooks so that libCEED is responsible for (batched) matrix-free operators?
I would really like to use adaptive coarse basis construction in our framework, still with separable element solves.
The idea would be to reuse PCBDDC code and hook up your specialized solvers for the interior and Arr. Adaptive coarse spaces can be built provided we have the explicit Schur complement. I have an old branch where I started supporting multiple subdomains per process: https://gitlab.com/petsc/petsc/-/tree/stefanozampini/bddc-ceed. I'm also interested in this.
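The explicit interface Schur complement mentioned here is S = A_GG - A_GI A_II^{-1} A_IG; as a dense NumPy sketch on a toy SPD matrix (an illustration of the formula, not how PCBDDC or libCEED actually compute it):

```python
import numpy as np

# Toy SPD matrix partitioned as [interior | interface].
A = np.array([[ 4.0, -1.0, -1.0,  0.0],
              [-1.0,  4.0,  0.0, -1.0],
              [-1.0,  0.0,  4.0, -1.0],
              [ 0.0, -1.0, -1.0,  4.0]])
ni = 2
A_II, A_IG = A[:ni, :ni], A[:ni, ni:]
A_GI, A_GG = A[ni:, :ni], A[ni:, ni:]

# Explicit interface Schur complement S = A_GG - A_GI A_II^{-1} A_IG.
S = A_GG - A_GI @ np.linalg.solve(A_II, A_IG)

# S inherits symmetry and positive definiteness from A.
print(np.allclose(S, S.T), np.all(np.linalg.eigvalsh(S) > 0))
```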
Note: This code mostly works. There is some small bug that is killing our convergence that I haven't had spare time to chase down.
Getting BDDC to work can be painful, I know that :-)
Rebased for changes on main. Same slow convergence, but it does converge.
-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
PETSc:
PETSc Vec Type : seq
libCEED:
libCEED Backend : /cpu/self/xsmm/blocked
libCEED Backend MemType : host
Mesh:
Number of 1D Basis Nodes (p) : 3
Number of 1D Quadrature Points (q) : 4
Global Nodes : 125
Owned Nodes : 125
DoF per node : 1
BDDC:
Injection : scaled
Global Interface Nodes : 8
Owned Interface Nodes : 8
KSP:
KSP Type : cg
KSP Convergence : CONVERGED_RTOL
Total KSP Iterations : 26
Final rnorm : 1.132970e-09
BDDC:
PC Type : shell
Performance:
Pointwise Error (max) : 4.185030e-02
CG Solve Time : 0.0201494 (0.0201494) sec
How does the iteration count and condition number (-ksp_view_singularvalues) vary under grid refinement?
Size | Its | Singular Values
---|---|---
3x3x3 | 26 | max 12.0304, min 0.305409, max/min 39.3913
5x5x5 | 64 | max 12.2401, min 0.264016, max/min 46.3613
10x10x10 | 79 | max 12.4294, min 0.108125, max/min 114.954
15x15x15 | 110 | max 14.8259, min 0.0588987, max/min 251.719
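As a rule of thumb (not an exact bound), CG iteration counts are expected to grow roughly like sqrt(max/min); a quick check of the table's numbers against that scaling:

```python
import math

# (size, iterations, max/min singular value) from the table above.
runs = [("3x3x3", 26, 39.3913),
        ("5x5x5", 64, 46.3613),
        ("10x10x10", 79, 114.954),
        ("15x15x15", 110, 251.719)]

# If the sqrt(kappa) bound were tight, its / sqrt(kappa) would stay
# roughly constant under refinement; drift indicates other effects.
ratios = [its / math.sqrt(kappa) for _, its, kappa in runs]
for (size, its, kappa), r in zip(runs, ratios):
    print(f"{size:>9}  its={its:3d}  sqrt(kappa)={math.sqrt(kappa):6.2f}  ratio={r:.2f}")
```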
This is for 3D with only corners as primal dofs?
Correct, 3D 2nd order basis as the fine mesh with corners only as the vertex space
Can you compare with PCBDDC using src/ksp/ksp/tutorials/ex59.c?