CUDA-aware MPI enabled in Hypre?
Hi,
My C++ code is built with PETSc 3.23 and Hypre 2.33, on top of Open MPI with CUDA support enabled. It runs fine on multiple GPUs, with device-to-device MPI communications in the C++ part and in the PETSc part, but not in the Hypre part. When profiling with Nsight Systems, we see D2H and H2D copies around the MPI calls:
Blue: PCApply/spmv_fixup_kernel_v2
Red: D2H copy (8000 bytes)
Grey: MPI_Irecv, MPI_Isend, MPI_Waitall
Green: H2D copy (8000 bytes)
Blue: PCApply/csmv_v2_partition_kernel
Hypre is configured during the PETSc build with --enable-gpu-aware-mpi, and HYPRE_USING_GPU_AWARE_MPI is defined in the HYPRE_config.h file.
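For reference, this is how I check it (the path assumes Hypre was installed by the PETSc build under $PETSC_DIR/$PETSC_ARCH; adjust if Hypre lives elsewhere):
grep HYPRE_USING_GPU_AWARE_MPI $PETSC_DIR/$PETSC_ARCH/include/HYPRE_config.h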
CUDA support is indeed enabled in the MPI we use:
ompi_info --parsable --all 2>/dev/null | grep mpi_built_with_cuda_support:value:true
mca:mpi:base:param:mpi_built_with_cuda_support:value:true
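For completeness, I also check at run time that the library actually loaded is CUDA-aware. A minimal sketch, assuming Open MPI's mpi-ext.h extensions (MPIX_CUDA_AWARE_SUPPORT / MPIX_Query_cuda_support) are available:

    #include <mpi.h>
    #include <mpi-ext.h>   // Open MPI extensions: MPIX_CUDA_AWARE_SUPPORT, MPIX_Query_cuda_support()
    #include <cstdio>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
    // Compile-time flag: set if the MPI headers were built with CUDA support
    #if defined(MPIX_CUDA_AWARE_SUPPORT) && MPIX_CUDA_AWARE_SUPPORT
        std::printf("compile-time CUDA-aware MPI: yes\n");
    #else
        std::printf("compile-time CUDA-aware MPI: no / unknown\n");
    #endif
    // Run-time query: the library picked up at run time may differ from the headers
    #if defined(MPIX_CUDA_AWARE_SUPPORT)
        std::printf("run-time CUDA-aware MPI: %s\n",
                    MPIX_Query_cuda_support() ? "yes" : "no");
    #endif
        MPI_Finalize();
        return 0;
    }

Compiled with mpicxx and launched under the same mpirun as the solver, it reports "yes" for both on our machine.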
What could be the reason that D2D transfers happen in my code and in PETSc, but not in Hypre?
The Hypre preconditioner (BoomerAMG) is used through PETSc with the KSP solver:
-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.7 -mat_type mpiaijcusparse -vec_type mpicuda
Thanks for any tips,
Pierre
Hi Pierre, are you able to share a reproducer (MWE)? I assume you had -use_gpu_aware_mpi 1, but just checking.
Hi Victor,
Thanks for your help; sure, I can share a PETSc reproducer. The Nsight Systems profile above is from it:
You would need to build PETSc 3.23.2 with Hypre 2.33 and run ex46 from share/petsc/examples/src/ksp/ksp/tutorials:
mpirun -np 2 ./ex46 -da_grid_x 1000 -da_grid_y 1000 -ksp_monitor -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.7 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -use_gpu_aware_mpi 1
You can compare with PETSc's algebraic multigrid preconditioner (GAMG), where no D2H or H2D copies happen during KSPSolve (except a small 8-byte one for the residual):
mpirun -np 2 ./ex46 -da_grid_x 1000 -da_grid_y 1000 -ksp_monitor -ksp_type cg -pc_type gamg -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -use_gpu_aware_mpi 1
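If useful, the timelines above can be captured with Nsight Systems along these lines (the trace flags and the per-rank output naming are just how I would do it; adjust to your setup):
mpirun -np 2 nsys profile --trace=cuda,mpi,nvtx -o ex46_rank%q{OMPI_COMM_WORLD_RANK} ./ex46 -da_grid_x 1000 -da_grid_y 1000 -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.7 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -use_gpu_aware_mpi 1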
AFAIK, -use_gpu_aware_mpi 1 is the default in PETSc if the MPI is GPU-aware, but I added it anyway.
Hope it helps,
Pierre