
BDDC Example

Open jeremylt opened this issue 3 years ago • 37 comments

Nothing much to look at here yet. I'm just putting this here to make it easier to see/comment on what I'm doing.

jeremylt avatar Apr 06 '21 16:04 jeremylt

There is some debugging to do after #744 merges, but this should be close. The open task is prolonging into the broken space: we don't really have a way to go from one QFunction output to a pair of target vectors.

jeremylt avatar Apr 16 '21 20:04 jeremylt

I have not started debugging yet, but all of the pieces I want are there. Here's to hoping

jeremylt avatar Apr 18 '21 22:04 jeremylt

Huzzah - it compiles. Now to see what else is broken about it.

jeremylt avatar Apr 19 '21 16:04 jeremylt

And a bug in FDM inverse is blocking further testing. Investigating in separate branch.

jeremylt avatar Apr 19 '21 19:04 jeremylt

it does not converge, but it all runs now

new errors are good errors

jeremylt avatar Apr 21 '21 21:04 jeremylt

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : DIVERGED_DTOL
    Total KSP Iterations               : 682
    Final rnorm                        : 4.136087e+05
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 2.211230e+01
    CG Solve Time                      : 0.0913367 (0.0913367) sec

Now I'm diverging. This is a new error state, as previously I had 0 KSP iterations.

I think there are more issues with the FDM.

jeremylt avatar Apr 22 '21 21:04 jeremylt

ooh, now I'm diverging with an indefinite PC, but closer to the solution

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : DIVERGED_INDEFINITE_PC
    Total KSP Iterations               : 26
    Final rnorm                        : 1.858340e-05
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.219564e-02
    CG Solve Time                      : 0.00492215 (0.00492215) sec

jeremylt avatar Apr 23 '21 18:04 jeremylt

And the final error is within the cutoff we use for the test suite, so maybe it is working?

jeremylt avatar Apr 23 '21 18:04 jeremylt

PC should be SPD. Are you doing the lumped variant or the "Dirichlet" version (harmonic extension in the balancing)? You can use PCComputeExplicitOperator to see if it's symmetric and positive definite.

jedbrown avatar Apr 23 '21 18:04 jedbrown
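
A minimal sketch of the check suggested above, written in plain Python rather than against PETSc. It assumes the preconditioner has already been assembled into a small dense matrix (for example from the explicit operator); the helper names `is_symmetric` and `is_positive_definite` and the sample matrices are illustrative assumptions, not part of libCEED or PETSc. Symmetry is tested entrywise, and positive definiteness by attempting a Cholesky factorization, which fails exactly when a pivot is not positive.

```python
import math


def is_symmetric(a, tol=1e-12):
    """Return True if the dense matrix a equals its transpose up to a relative tolerance."""
    n = len(a)
    scale = max(abs(x) for row in a for x in row) or 1.0
    return all(abs(a[i][j] - a[j][i]) <= tol * scale
               for i in range(n) for j in range(i))


def is_positive_definite(a):
    """Attempt a Cholesky factorization of a symmetric matrix; a non-positive pivot means not PD."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(chol[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            return False
        chol[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = (a[i][j] - s) / chol[j][j]
    return True


# Made-up matrices for illustration only.
spd = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
indefinite = [[1.0, 2.0], [2.0, 1.0]]   # symmetric, eigenvalues 3 and -1
nonsymmetric = [[2.0, 1.0], [0.0, 2.0]]

for name, m in [("spd", spd), ("indefinite", indefinite), ("nonsymmetric", nonsymmetric)]:
    sym = is_symmetric(m)
    pd = sym and is_positive_definite(m)
    print(f"{name}: symmetric={sym}, positive definite={pd}")
```

A symmetric but indefinite result would be consistent with the DIVERGED_INDEFINITE_PC reason reported by CG; a nonsymmetric result would point at the injection or scaling not being applied as a transpose pair.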

I'm starting with the lumped variant and then I'll add the Dirichlet version once the lumped works

jeremylt avatar Apr 23 '21 18:04 jeremylt

You could make the elements macro-elements containing Q1 sub-elements, then compare with our BDDC LFA paper and the corresponding results in PETSc. I don't know if that would take longer than "just find the bug". There's a high-order FE BDDC example in PETSc that could also be used, at one element per process.

jedbrown avatar Apr 23 '21 19:04 jedbrown

"Just find the bug" was faster in this case.

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  BDDC:
    Injection                          : scaled
    Global Interface Nodes             : 8
    Owned Interface Nodes              : 8
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 49
    Final rnorm                        : 9.363374e-10
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.185030e-02
    CG Solve Time                      : 0.00968734 (0.00968734) sec

jeremylt avatar Apr 23 '21 19:04 jeremylt

Since this converges and is leak free, I'm tagging it as "In Review".

#749 and #750 are prerequisites.

jeremylt avatar Apr 23 '21 19:04 jeremylt

Well, I added the harmonic extension but something is screwy

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 729
    Owned Nodes                        : 729
    DoF per node                       : 1
  BDDC:
    Injection                          : harmonic
    Global Interface Nodes             : 64
    Owned Interface Nodes              : 64
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 207
    Final rnorm                        : 1.280022e-09
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 3.626101e-02
    CG Solve Time                      : 0.21704 (0.21704) sec

jeremylt avatar Apr 27 '21 22:04 jeremylt

And bp1 stopped converging somewhere in there.

jeremylt avatar Apr 27 '21 22:04 jeremylt

Something is screwy, I found a way to make it even worse

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  BDDC:
    Injection                          : scaled
    Global Interface Nodes             : 8
    Owned Interface Nodes              : 8
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 61
    Final rnorm                        : 7.065059e-10
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.185030e-02
    CG Solve Time                      : 0.0171467 (0.0171467) sec

jeremylt avatar Apr 30 '21 23:04 jeremylt

With the latest fixes in #749

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  BDDC:
    Injection                          : scaled
    Global Interface Nodes             : 8
    Owned Interface Nodes              : 8
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 17
    Final rnorm                        : 2.510341e-10
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.185030e-02
    CG Solve Time                      : 0.0300374 (0.0300374) sec

Naturally, this still isn't quite right, but it's less bad than it was

jeremylt avatar May 10 '21 19:05 jeremylt

The latest FDM change made things worse again

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/avx/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  BDDC:
    Injection                          : scaled
    Global Interface Nodes             : 8
    Owned Interface Nodes              : 8
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 26
    Final rnorm                        : 6.412850e-10
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.185030e-02
    CG Solve Time                      : 0.0085119 (0.0085119) sec

jeremylt avatar May 11 '21 22:05 jeremylt

The number of iterations grows less rapidly when I increase the number of cells though, so that's a win

jeremylt avatar May 11 '21 23:05 jeremylt

@jeremylt @jedbrown I did not know you had a working BDDC code. I was wondering if the Arrinv solve can be hooked up in PCBDDC as a subdomain solver.

stefanozampini avatar Oct 01 '21 20:10 stefanozampini

Do you want to use PCBDDC with one subdomain per process and put libCEED's Arrinv in as the subdomain solver? Or would you make PCBDDC work with many subdomains, with hooks so that libCEED is responsible for (batched) matrix-free operators?

I would really like to use adaptive coarse basis construction in our framework, still with separable element solves.

jedbrown avatar Oct 01 '21 20:10 jedbrown

The idea would be to reuse the PCBDDC code and hook up your specialized solvers for the interior and Arr. Adaptive coarse spaces can be built provided we have the explicit Schur complement. I have an old branch where I started supporting multiple subdomains per process: https://gitlab.com/petsc/petsc/-/tree/stefanozampini/bddc-ceed
I'm also interested in this.

stefanozampini avatar Oct 01 '21 20:10 stefanozampini

Note: This code mostly works. There is some small bug that is killing our convergence that I haven't had space time to chase down.

jeremylt avatar Oct 01 '21 21:10 jeremylt

> Note: This code mostly works. There is some small bug that is killing our convergence that I haven't had space time to chase down.

Getting BDDC to work can be painful, I know that :-)

stefanozampini avatar Oct 01 '21 21:10 stefanozampini

Rebased for changes on main. Same slow convergence, but it does converge.

-- CEED Benchmark Problem 3 -- libCEED + PETSc + BDDC --
  PETSc:
    PETSc Vec Type                     : seq
  libCEED:
    libCEED Backend                    : /cpu/self/xsmm/blocked
    libCEED Backend MemType            : host
  Mesh:
    Number of 1D Basis Nodes (p)       : 3
    Number of 1D Quadrature Points (q) : 4
    Global Nodes                       : 125
    Owned Nodes                        : 125
    DoF per node                       : 1
  BDDC:
    Injection                          : scaled
    Global Interface Nodes             : 8
    Owned Interface Nodes              : 8
  KSP:
    KSP Type                           : cg
    KSP Convergence                    : CONVERGED_RTOL
    Total KSP Iterations               : 26
    Final rnorm                        : 1.132970e-09
  BDDC:
    PC Type                            : shell
  Performance:
    Pointwise Error (max)              : 4.185030e-02
    CG Solve Time                      : 0.0201494 (0.0201494) sec

jeremylt avatar Jan 19 '22 18:01 jeremylt

How do the iteration count and condition number (-ksp_view_singularvalues) vary under grid refinement?

jedbrown avatar Jan 19 '22 18:01 jedbrown

Singular values:

Size      Its  Max      Min        Max/Min
3x3x3      26  12.0304  0.305409    39.3913
5x5x5      64  12.2401  0.264016    46.3613
10x10x10   79  12.4294  0.108125   114.954
15x15x15  110  14.8259  0.0588987  251.719

jeremylt avatar Jan 19 '22 18:01 jeremylt
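
As a quick cross-check of the singular values reported above, the sketch below recomputes the condition number estimate as max/min for each size and prints its square root next to the iteration count. The numbers are copied from the comment above; the variable and function names are illustrative assumptions. The square root is shown only because classical CG bounds scale with the square root of the condition number, so this is a rough indicator and not a prediction of the KSP iteration count.

```python
import math

# (size, CG iterations, max singular value, min singular value, reported max/min),
# copied from the results posted above.
rows = [
    ("3x3x3", 26, 12.0304, 0.305409, 39.3913),
    ("5x5x5", 64, 12.2401, 0.264016, 46.3613),
    ("10x10x10", 79, 12.4294, 0.108125, 114.954),
    ("15x15x15", 110, 14.8259, 0.0588987, 251.719),
]


def condition_number(sigma_max, sigma_min):
    """Condition number estimate from the extreme singular values."""
    return sigma_max / sigma_min


kappas = [condition_number(smax, smin) for _, _, smax, smin, _ in rows]

for (size, its, _, _, reported), kappa in zip(rows, kappas):
    print(f"{size:>8}  its={its:3d}  kappa={kappa:8.3f}  "
          f"reported={reported:8.3f}  sqrt(kappa)={math.sqrt(kappa):6.2f}")
```

The recomputed ratios agree with the reported max/min column to within rounding of the inputs, and both the condition number and the iteration count grow with every refinement step.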

This is for 3D with only corners as primal dofs?

jedbrown avatar Jan 19 '22 18:01 jedbrown

Correct, 3D 2nd order basis as the fine mesh with corners only as the vertex space

jeremylt avatar Jan 19 '22 18:01 jeremylt

Can you compare with PCBDDC using src/ksp/ksp/tutorials/ex59.c?

jedbrown avatar Jan 19 '22 18:01 jedbrown