cuda-profiler
nvtx_pmpi Fortran interface crashes when using MPI_IN_PLACE
nvtx_pmpi translates Fortran MPI_* calls to C PMPI_* calls itself, rather than leaving that step to the underlying MPI library. Unfortunately, it gets some things wrong in the process, in particular the handling of special constants that are passed in place of data pointers, such as MPI_IN_PLACE and MPI_BOTTOM.
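To illustrate the failure mode, here is a minimal sketch in C that does not use a real MPI library. Every name in it (`fortran_in_place_sentinel`, `C_IN_PLACE`, `translate_buffer`, `mock_c_allreduce`, `wrapper_allreduce_`) is a hypothetical stand-in, not an nvtx_pmpi or MPI symbol. It assumes the Fortran side represents MPI_IN_PLACE as the address of a special object while the C side uses a distinct sentinel pointer value, so a Fortran-to-C wrapper has to translate one into the other before forwarding the call. How the Fortran sentinel can be recognized in a real MPI implementation is implementation-specific, which is the difficulty described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the object whose address the Fortran
 * bindings pass when the user writes MPI_IN_PLACE. */
static int fortran_in_place_sentinel;

/* Hypothetical stand-in for the C library's MPI_IN_PLACE value. */
#define C_IN_PLACE ((void *)-1)

/* Map the Fortran sentinel address to the C sentinel; pass real
 * buffers through unchanged. */
static void *translate_buffer(void *fortran_buf)
{
    if (fortran_buf == (void *)&fortran_in_place_sentinel)
        return C_IN_PLACE;
    return fortran_buf;
}

/* Mock of a C-side reduction on a single process: with the in-place
 * sentinel the receive buffer already holds the result, otherwise the
 * send buffer is copied into it. */
static int mock_c_allreduce(const void *sendbuf, int *recvbuf, int count)
{
    if (sendbuf == C_IN_PLACE)
        return 0;
    const int *src = (const int *)sendbuf;
    for (int i = 0; i < count; ++i)
        recvbuf[i] = src[i];
    return 0;
}

/* Wrapper with a Fortran-style interface: every argument arrives by
 * reference and the status is returned through ierr. */
void wrapper_allreduce_(void *sendbuf, int *recvbuf, int *count, int *ierr)
{
    assert(count != NULL && ierr != NULL);
    /* Without translate_buffer, the sentinel's address would be
     * treated as a real send buffer and read as data. */
    *ierr = mock_c_allreduce(translate_buffer(sendbuf), recvbuf, *count);
}
```

Calling `wrapper_allreduce_` with `&fortran_in_place_sentinel` as the send buffer leaves the receive buffer untouched, whereas forwarding that address untranslated would make the mock read `count` integers starting at the sentinel object.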
I have a workaround for OpenMPI/SpectrumMPI, but it's not general, and I'm not sure it's even possible to do this generically. I guess the first question is whether there is interest in addressing this issue. If so, it would be worth discussing options for how to do it.
I would propose using https://github.com/LLNL/wrap/blob/fortran-fixes/wrap.py, which also addresses this issue for other MPI implementations.