
MPI GPU interface refactoring

Open ethanglaser opened this issue 1 year ago • 45 comments

Description

Changes proposed in this pull request:

  • Add a virtual get_mpi_offload_support function to the base communicator; it defaults to false in nearly all cases
  • Add logic to get_mpi_offload_support in mpi/communicator.h that checks the MPI libraries for the correct symbol to determine whether Level Zero is supported
  • Add a conditional in detail/communicator.cpp that uses the result of get_mpi_offload_support to decide whether to convert data to host (the previous default) or leave it as is, which yields performance improvements when MPI supports GPU offload
  • Modify the sendrecv_replace arguments to include an optional additional buffer, accommodating an MPICH workaround that calls sendrecv with two GPU buffers
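
For illustration, the capability query and the dispatch it drives might look roughly like the sketch below. Only the name get_mpi_offload_support comes from this pull request; the class names, the transfer_path enum, and choose_path are hypothetical stand-ins, and the real symbol probe is replaced by a constructor argument so the sketch compiles without MPI or SYCL.

```cpp
#include <cassert>

// Hypothetical stand-in for the base communicator interface.
class communicator_iface {
public:
    virtual ~communicator_iface() = default;
    // Defaults to false: a backend is assumed NOT to accept device pointers
    // unless it explicitly overrides this query.
    virtual bool get_mpi_offload_support() const { return false; }
};

// Hypothetical MPI-backed communicator. The real implementation inspects the
// MPI library for a symbol; here the probe result is simply passed in.
class mpi_communicator : public communicator_iface {
public:
    explicit mpi_communicator(bool probe_result) : offload_(probe_result) {}
    bool get_mpi_offload_support() const override { return offload_; }

private:
    bool offload_;
};

// Hypothetical label for the two ways a device buffer can reach the backend.
enum class transfer_path { host_staged, device_direct };

// Mirrors the conditional described above: copy to host first (the previous
// default) unless the backend reports GPU offload support.
inline transfer_path choose_path(const communicator_iface& comm) {
    return comm.get_mpi_offload_support() ? transfer_path::device_direct
                                          : transfer_path::host_staged;
}

// Usage sketch: the default backend stages through host memory, while a
// GPU-aware backend receives the device buffer unchanged.
inline void example() {
    communicator_iface base;
    mpi_communicator gpu_aware{true};
    assert(choose_path(base) == transfer_path::host_staged);
    assert(choose_path(gpu_aware) == transfer_path::device_direct);
}
```

Keeping the query virtual with a false default means existing communicator backends keep the host-staging behavior without modification; only a backend that can verify GPU support opts in.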

ethanglaser, Nov 14 '23 21:11