undefined reference to `PMPI_Accumulate_c' when using Intel oneAPI 2024.1

Open adam-sim-dev opened this issue 10 months ago • 8 comments

cmake -S . -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DWITH_MPI=On -DMPI_C_COMPILER=mpiicx -DMPI_CXX_COMPILER=mpiicpx -DBUILD_SHARED_LIBS=Off -DWITH_GOTCHA=Off -DCMAKE_INSTALL_PREFIX=./caliper -DCMAKE_INSTALL_LIBDIR=lib

cmake.log make.log

adam-sim-dev avatar Mar 29 '24 13:03 adam-sim-dev

There is no problem when I build Caliper using Intel oneAPI 2024.0. With oneAPI 2024.1, /opt/intel/oneapi/mpi/latest/include/mpi.h contains only the declaration of MPI_Accumulate_c, with no corresponding PMPI_Accumulate_c:

int MPI_Accumulate_c(const void *origin_addr, MPI_Count origin_count, MPI_Datatype origin_datatype,
                     int target_rank, MPI_Aint target_disp, MPI_Count target_count,
                     MPI_Datatype target_datatype, MPI_Op op, MPI_Win win)
                     MPICH_ATTR_POINTER_WITH_TYPE_TAG(1,3) MPICH_API_PUBLIC;
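
One rough way to check a header for this kind of gap is to list every MPI_ function that has no PMPI_ counterpart declared. The sketch below is only an illustration: the helper name missing_pmpi and the sample text are made up, and the regular expression is a crude approximation of C declarations, not a parser. It also looks only at header declarations, whereas the link error itself concerns symbols in the MPI library.

```python
import re

# Matches an MPI_ or PMPI_ identifier that is directly followed by "(",
# which is a rough stand-in for "this is a function declaration".
DECL_RE = re.compile(r"\b(P?MPI_\w+)\s*\(")

def missing_pmpi(header_text):
    """Return sorted MPI_ function names with no PMPI_ twin in header_text."""
    names = set(DECL_RE.findall(header_text))
    mpi = {n for n in names if n.startswith("MPI_")}
    pmpi = {n for n in names if n.startswith("PMPI_")}
    return sorted(n for n in mpi if "P" + n not in pmpi)

# Made-up sample text standing in for the contents of mpi.h.
sample = """
int MPI_Send(const void *buf, int count);
int PMPI_Send(const void *buf, int count);
int MPI_Accumulate_c(const void *origin_addr, MPI_Count origin_count);
"""
print(missing_pmpi(sample))  # ['MPI_Accumulate_c']
```

To run it against a real installation, pass the text of the header quoted above (for example, the result of reading /opt/intel/oneapi/mpi/latest/include/mpi.h) instead of the sample string.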

adam-sim-dev avatar Mar 29 '24 13:03 adam-sim-dev

Hi @adam-sim-dev, interesting, thanks for the report! What system are you on?

daboehme avatar Mar 29 '24 17:03 daboehme

> Hi @adam-sim-dev, interesting, thanks for the report! What system are you on?

Ubuntu 22.04

adam-sim-dev avatar Mar 29 '24 19:03 adam-sim-dev

Okay, thanks. It sounds like a bug in oneAPI since every MPI_ function is supposed to have a corresponding PMPI_ function, but I'll see if I can work around it.

daboehme avatar Mar 29 '24 20:03 daboehme

This issue still exists with oneAPI 2024.2. If it is a bug in oneAPI, can we report it to Intel? (Sorry, I do not know what the specific problem is.) It would be good to have a workaround in Caliper.

adam-sim-dev avatar Jun 25 '24 12:06 adam-sim-dev

Any progress on this issue? @daboehme

adam-sim-dev avatar Aug 29 '24 14:08 adam-sim-dev

Hi @adam-sim-dev, apologies for not getting back to this earlier. I do think it's an issue with oneAPI and it would be good to report it. Every MPI_ function should have an equivalent PMPI_ function but apparently they forgot to add one for MPI_Accumulate_c.

Any particular reason you're disabling Gotcha? The PMPI_ issue won't happen if you use Gotcha for wrapping MPI functions. Gotcha used to have some issues in particular with the Intel software stack, but there were several improvements in the latest versions that should fix these. Might be worth giving it a try again. Requires you to link MPI as a shared library though.

Generally a fix will probably require a manual workaround. If you're feeling adventurous you can hack src/services/mpiwrap/wrap.py and add MPI_Accumulate_c to the exclude_strings list.
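
As a minimal sketch of what such an exclusion does, assuming exclude_strings is a list of substrings matched against each MPI declaration (the real wrap.py may be structured differently, and the names functions_to_wrap and decls below are hypothetical):

```python
import re

# Hypothetical stand-in for the exclude_strings list mentioned above;
# the real list in wrap.py contains other entries as well.
exclude_strings = ["MPI_Accumulate_c"]
exclude_re = re.compile("|".join(re.escape(s) for s in exclude_strings))

def functions_to_wrap(declarations):
    """Keep only the declarations that match none of the exclude strings."""
    return [d for d in declarations if not exclude_re.search(d)]

# Made-up declarations for illustration.
decls = [
    "int MPI_Accumulate(const void *origin_addr, int origin_count);",
    "int MPI_Accumulate_c(const void *origin_addr, MPI_Count origin_count);",
    "int MPI_Send(const void *buf, int count);",
]
for d in functions_to_wrap(decls):
    print(d)
```

With this filter, no wrapper is generated for MPI_Accumulate_c, so nothing refers to the missing PMPI_Accumulate_c symbol. The trade-off is that calls to MPI_Accumulate_c are no longer intercepted.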

daboehme avatar Aug 30 '24 00:08 daboehme

> Any particular reason you're disabling Gotcha? The PMPI_ issue won't happen if you use Gotcha for wrapping MPI functions. Gotcha used to have some issues in particular with the Intel software stack, but there were several improvements in the latest versions that should fix these. Might be worth giving it a try again. Requires you to link MPI as a shared library though.

I cannot remember why I set -DWITH_GOTCHA=Off, but I will give it a try.

adam-sim-dev avatar Aug 30 '24 01:08 adam-sim-dev