mpi4py: ABI build broken

Open dalcinl opened this issue 1 year ago • 4 comments

https://github.com/mpi4py/mpi4py-testing/actions/runs/9654833551/job/26629748498

Traceback (most recent call last):
  File "/home/runner/work/mpi4py-testing/mpi4py-testing/test/main.py", line 304, in <module>
    main(module=None)
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/unittest/main.py", line 104, in __init__
    self.parseArgs(argv)
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/unittest/main.py", line 136, in parseArgs
    self._do_discovery([])
  File "/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/unittest/main.py", line 253, in _do_discovery
    self.createTests(from_discovery=True, Loader=Loader)
  File "/home/runner/work/mpi4py-testing/mpi4py-testing/test/main.py", line 268, in createTests
    setup_modules(self)
  File "/home/runner/work/mpi4py-testing/mpi4py-testing/test/main.py", line 184, in setup_modules
    import mpi4py.MPI
ImportError: /usr/local/lib/libmpi_abi.so.0: undefined symbol: MPIX_Async_get_state

dalcinl avatar Jun 25 '24 07:06 dalcinl

We need to add the mpi-abi config into our CI review workflow

hzhou avatar Jun 25 '24 15:06 hzhou

I tried building MPICH locally (GCC 14.1.1), and make fails as follows:

../mpich/src/util/mpir_async_things.c: In function 'MPIR_Async_things_finalize':
../mpich/src/util/mpir_async_things.c:35:13: error: implicit declaration of function 'PMPIX_Stream_progress'; did you mean 'MPID_Stream_progress'? [-Wimplicit-function-declaration]
   35 |             PMPIX_Stream_progress(MPIX_STREAM_NULL);
      |             ^~~~~~~~~~~~~~~~~~~~~
      |             MPID_Stream_progress
make[2]: *** [Makefile:26348: src/util/lib_libmpi_abi_la-mpir_async_things.lo] Error 1

dalcinl avatar Jun 26 '24 07:06 dalcinl

> ImportError: /usr/local/lib/libmpi_abi.so.0: undefined symbol: MPIX_Async_get_state

I try not to expose MPIX functions to the ABI build. Is this an issue with mpi4py?
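One way to check which MPIX names the ABI library still references without defining them is to list its undefined dynamic symbols. The following is only a sketch: it assumes binutils `nm` is on the PATH, and the helper names `parse_undefined` and `undefined_symbols` are hypothetical, not part of MPICH or mpi4py.

```python
import subprocess


def parse_undefined(nm_output, prefix="MPIX_"):
    """Extract undefined symbol names starting with `prefix` from nm output."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries look like "U name" or "U name@VERSION".
        if len(fields) >= 2 and fields[-2] == "U":
            name = fields[-1].split("@")[0]
            if name.startswith(prefix):
                names.append(name)
    return names


def undefined_symbols(path, prefix="MPIX_"):
    """Run `nm -D --undefined-only` on a shared library and filter by prefix."""
    result = subprocess.run(
        ["nm", "-D", "--undefined-only", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_undefined(result.stdout, prefix)


if __name__ == "__main__":
    # Path taken from the ImportError in this issue; adjust for a local prefix.
    print(undefined_symbols("/usr/local/lib/libmpi_abi.so.0"))
```

Any name printed here is referenced by the library but must be supplied by something else at load time, which is where the ImportError comes from if nothing provides it.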

> ../mpich/src/util/mpir_async_things.c:35:13: error: implicit declaration of function 'PMPIX_Stream_progress'; did you mean 'MPID_Stream_progress'? [-Wimplicit-function-declaration]

Will fix.

hzhou avatar Jun 26 '24 16:06 hzhou

> I try not to expose MPIX functions to the ABI build. Is this an issue with mpi4py?

No, not at all. mpi4py does not use these MPIX APIs. I believe the mpi4py failure is closely related to the second one. Note that in the first traceback I reported, the error happens while loading libmpi_abi.so. The library itself has an undefined symbol, likely because of a missing extern/public declaration.
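The load-time failure can be reproduced without mpi4py by opening the library directly. This is a minimal sketch, assuming a POSIX system; the helper name `check_library_loads` is hypothetical, and the path is the one from the traceback.

```python
import ctypes
import os


def check_library_loads(path):
    """Return None if `path` loads with all symbols resolved, else the loader's error text."""
    try:
        # RTLD_NOW requests eager symbol resolution, so an unresolved
        # function reference is reported at load time, not on first call.
        ctypes.CDLL(path, mode=os.RTLD_NOW)
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Path taken from the traceback above; adjust for a local install prefix.
    error = check_library_loads("/usr/local/lib/libmpi_abi.so.0")
    print("loads cleanly" if error is None else f"load failed: {error}")
```

If the library itself is missing a definition, the reported error should name the undefined symbol, independent of any Python extension module.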

PS: In fact, the mpi4py ABI tests build mpi4py with the ABI stubs, then attempt to use the MPICH ABI build at runtime.

dalcinl avatar Jun 26 '24 16:06 dalcinl