Oceananigans.jl
Distributed mixed FFT / vertical tridiagonal solver
This PR builds off #2536 and implements a distributed Poisson solver that uses horizontal FFTs and a vertical tridiagonal solve, with more help from @jipolanco.
When distributed in (x, y), this kind of solver is more expensive than a pure FFT-based solver, because it requires 4 additional transpositions + communication.
For problems that are only distributed in x or y (e.g., a slab decomposition), we can avoid the additional transpositions. ~~Implementing that optimization is TODO for this PR.~~
Some of the details are discussed on https://github.com/jipolanco/PencilFFTs.jl/issues/44.
Future work, which would require abstracting the implementation of hydrostatic pressure in NonhydrostaticModel (and, for friendliness, forbidding the use of VerticallyImplicitTimeDiscretization), could in principle support a more efficient version of this solver with a pencil decomposition in (y, z) or (x, z). This memory layout would improve performance for very large problems that require a 2D domain decomposition, since decomposing in (y, z) or (x, z) reduces the number of transposes by 4 relative to (x, y). This feature is easy to code but might take some time to test. We've already noticed on #1910 that lumping hydrostatic and nonhydrostatic pressure produces different (perhaps lower-quality) solutions.
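For intuition, here is a minimal single-process sketch of the mixed algorithm in numpy (a stand-in for the actual distributed Julia implementation): forward FFT in the periodic horizontal directions, one tridiagonal solve per horizontal wavenumber in the bounded vertical direction, then an inverse FFT. The boundary conditions (periodic in x, y; no-flux in z) and the pinning of the singular mean mode are illustrative assumptions, not a transcription of the PR's code:

```python
import numpy as np
from scipy.linalg import solve_banded

def laplacian(phi, dx, dy, dz):
    """Second-order Laplacian: periodic in x, y; no-flux (Neumann) in z."""
    Lx = (np.roll(phi, -1, 0) - 2 * phi + np.roll(phi, 1, 0)) / dx**2
    Ly = (np.roll(phi, -1, 1) - 2 * phi + np.roll(phi, 1, 1)) / dy**2
    zp = np.concatenate([phi[:, :, 1:], phi[:, :, -1:]], axis=2)  # zero-gradient ghost points
    zm = np.concatenate([phi[:, :, :1], phi[:, :, :-1]], axis=2)
    Lz = (zp - 2 * phi + zm) / dz**2
    return Lx + Ly + Lz

def solve_poisson(R, dx, dy, dz):
    """Hybrid solve: FFT in (x, y), tridiagonal solve along z."""
    Nx, Ny, Nz = R.shape
    Rh = np.fft.fftn(R, axes=(0, 1))
    # Eigenvalues of the periodic second-difference operator in x and y
    lam_x = -(2 * np.sin(np.pi * np.fft.fftfreq(Nx)) / dx) ** 2
    lam_y = -(2 * np.sin(np.pi * np.fft.fftfreq(Ny)) / dy) ** 2
    lam = lam_x[:, None] + lam_y[None, :]
    phi_h = np.empty_like(Rh)
    for i in range(Nx):
        for j in range(Ny):
            # Vertical Neumann tridiagonal operator shifted by the horizontal eigenvalue
            main = np.full(Nz, -2.0 / dz**2) + lam[i, j]
            main[0] += 1.0 / dz**2   # Neumann boundary rows
            main[-1] += 1.0 / dz**2
            upper = np.full(Nz, 1.0 / dz**2); upper[0] = 0.0
            lower = np.full(Nz, 1.0 / dz**2); lower[-1] = 0.0
            rhs = Rh[i, j, :].copy()
            if lam[i, j] == 0.0:
                # The horizontal-mean mode is singular; pin the first unknown
                main[0], upper[1], rhs[0] = 1.0, 0.0, 0.0
            ab = np.vstack([upper, main, lower])
            phi_h[i, j, :] = solve_banded((1, 1), ab, rhs)
    phi = np.real(np.fft.ifftn(phi_h, axes=(0, 1)))
    return phi - phi.mean()  # remove the free constant
```

A round trip through `laplacian` and `solve_poisson` recovers a zero-mean field to machine precision, which is a convenient correctness check for this kind of solver.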
TODO:
- [x] Implement a more efficient algorithm for 1D "slab" decompositions
- [x] Add tests
@jipolanco I just sent up another big commit --- I realized that we could use `extra_dims` when we use a 1D process grid + tridiagonal solve to save a lot of communication:
https://github.com/CliMA/Oceananigans.jl/blob/e1cac85ff8fdd9032549b1a3c32569bc71a92c1e/src/Distributed/distributed_fft_based_poisson_solver.jl#L307-L314
and
https://github.com/CliMA/Oceananigans.jl/blob/e1cac85ff8fdd9032549b1a3c32569bc71a92c1e/src/Distributed/distributed_fft_based_poisson_solver.jl#L162-L183
Let me know what you think.
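The `extra_dims` idea amounts to treating z as a pure batch dimension: only (x, y) are transformed, so z never participates in the transposes and each rank keeps whole vertical columns for the tridiagonal solve. A minimal numpy illustration of the batching equivalence (numpy as a single-process stand-in for PencilFFTs.jl):

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((8, 8, 4))  # (x, y, z)

# Transform the horizontal axes only; z rides along untouched
F = np.fft.fftn(R, axes=(0, 1))

# Identical to transforming each vertical level independently, which is why
# the vertical direction never needs to join the transpose/communication steps
F_levels = np.stack([np.fft.fft2(R[:, :, k]) for k in range(R.shape[2])], axis=2)
assert np.allclose(F, F_levels)
```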
> @jipolanco I just sent up another big commit --- I realized that we could use `extra_dims` when we use a 1D process grid + tridiagonal to save much communication:
I can't look at the details right now, but that sounds like a good option. To be honest, I haven't really used extra_dims and I was thinking about actually removing it :smile: But if you find it useful then we'll keep it there. Let me know if you find any issues.
Ok! It seems convenient for 1D process grids / slab decomposition that use a tridiagonal solve along the third dimension rather than a transform. But we'll see...
@simone-silvestri this may pass
@glwagner it seems the only reason the tests aren't passing is because mpiexecjl isn't properly linked:
/bin/bash: /storage5/buildkite-agent/.julia-7523/bin/mpiexecjl: No such file or directory
Maybe fix that and merge since (apparently) this PR is otherwise ready to go?
There's a less trivial error here: https://buildkite.com/clima/oceananigans/builds/7523#311535ff-f56d-410e-8571-c3b1d9757daf
I'll try to restart the whole build and see what happens.
Just realized the distributed tests have been running for 6 days. I guess it's fair to say there's still something to fix lol
Just killed it to save resources
@tomchor are you able to test locally? I believe these passed locally for me, so the problem might be relatively easy to solve.
I've never tested anything in parallel locally, but I can definitely try
@glwagner I ran the tests and they got stuck in the same place where this test got stuck. So it appears that there's something to be fixed here...
@glwagner how do you run locally? Do you use `mpiexecjl`?
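For reference, a typical way to run MPI tests locally with the `mpiexecjl` wrapper shipped by MPI.jl (the install path, rank count, and test file name here are illustrative, not taken from this PR):

```shell
# One-time: install the mpiexecjl wrapper (defaults to ~/.julia/bin)
julia --project -e 'using MPI; MPI.install_mpiexecjl()'

# Launch the distributed tests on 4 ranks
~/.julia/bin/mpiexecjl -n 4 julia --project test/test_distributed_models.jl
```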