Castro
add capability to use James solver for gravity BCs
AMReX now has a James BC solver:
https://github.com/AMReX-Codes/amrex/pull/2912
We should add the ability to use this instead of the multipole solver for the gravity BCs.
As a comparison point for an apples-to-apples performance test, build Exec/science/wdmerger with:
make USE_CUDA=TRUE USE_MPI=TRUE DIM=3 TINY_PROFILE=TRUE
This is the test case I ran on Perlmutter across 4 GPUs:
srun nsys profile -f true -s none -o wdmerger_256_%q{SLURM_PROCID} ./Castro3d.gnu.TPROF.MPI.CUDA.ex inputs amr.n_cell = 256 256 256 max_step = 10 amr.max_grid_size = 64 amr.blocking_factor = 16 amr.plot_files_output = 0 amr.checkpoint_files_output = 0
The interactive job was allocated with:
salloc -N 1 --gpus-per-task=1 --gpu-bind=map_gpu:0,1,2,3 --tasks-per-node=4 -t 120 --qos=interactive -A m3018_g -C gpu
This is what the profile looks like on the last timestep.
The gravity solve takes 79 ms, of which 24 ms is spent in the multipole BC and 55 ms is spent in the Poisson solve.
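To put those numbers in context, here is a rough back-of-the-envelope sketch in Python. It only uses the three timings quoted above; the speedup factors for the BC step are purely hypothetical placeholders, not measurements of the James solver, and the helper name gravity_time is made up for illustration.

```python
# Timings quoted above for the last timestep, in milliseconds.
gravity_ms = 79.0   # total gravity solve
bc_ms = 24.0        # multipole BC
poisson_ms = 55.0   # Poisson solve

# Share of the gravity solve currently spent in the BC step.
bc_fraction = bc_ms / gravity_ms


def gravity_time(bc_speedup):
    """Gravity solve time (ms) if only the BC step ran bc_speedup times faster.

    bc_speedup is a hypothetical factor, not a measured result.
    """
    return poisson_ms + bc_ms / bc_speedup


# Best case: the BC cost vanishes entirely and only the Poisson solve remains.
max_gravity_speedup = gravity_ms / poisson_ms

print(f"BC share of gravity solve: {bc_fraction:.1%}")
print(f"Upper bound on gravity speedup: {max_gravity_speedup:.2f}x")
for s in (2.0, 4.0):  # hypothetical BC speedups
    print(f"BC {s:.0f}x faster -> gravity solve {gravity_time(s):.1f} ms")
```

In this configuration the BC step is a bit under a third of the gravity solve, so even a free BC solve would cap the gravity speedup at roughly 1.4x.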
This is not the only relevant configuration to consider; at larger scale, gravity tends to dominate over hydro and the profile looks a bit different, making the BCs less important. But it's a useful starting point for analysis.