URGENT: Open MPI failure

Open alicebain opened this issue 1 year ago • 2 comments

Description

Hi all, after updating our cluster from CentOS 7 to CentOS Stream 9, I noticed a problem using pyoptsparse. As I am in the middle of some urgent calculations, I would very much appreciate your help.

Before (on CentOS 7), everything was working fine. Now (on Stream 9), importing pyoptsparse prints the following messages, and minimizations afterwards fail with very strange segmentation faults:

Python 3.10.14+ (heads/3.10:83518b3, Mar 22 2024, 18:23:14) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyoptsparse
cluster01:pid2860605.python: Failed to get bond0 (unit 0) cpu set
cluster01:pid2860605: PSM3 can't open nic unit: 0 (err=23)
cluster01:pid2860605.python: Failed to get bond0 (unit 0) cpu set
cluster01:pid2860605: PSM3 can't open nic unit: 0 (err=23)
cluster01:pid2860605.python: Failed to get bond0 (unit 0) cpu set
cluster01:pid2860605: PSM3 can't open nic unit: 0 (err=23)
cluster01:pid2860605.python: Failed to get bond0 (unit 0) cpu set
cluster01:pid2860605: PSM3 can't open nic unit: 0 (err=23)
--------------------------------------------------------------------------
Open MPI failed an OFI Libfabric library call (fi_endpoint).  This is highly
unusual; your job may behave unpredictably (and/or abort) after this.

  Local host: cluster01
  Location: mtl_ofi_component.c:513
  Error: Invalid argument (22)
--------------------------------------------------------------------------

It seems to be related to MPI, which I do not use, as I am running on a single node. Do you have any idea what I can do to fix this? Thanks so much!

Code versions

  • Operating System: CentOS Stream 9
  • Python: 3.10.14+
  • OpenMPI: 201511 (v5)

alicebain, Mar 24 '24 22:03

This is definitely an MPI issue, and probably unrelated to pyoptsparse. Since you say you are not using MPI, you could try running pyoptsparse without MPI support. This should be possible by removing the mpi4py package from your environment (see https://mdolab-pyoptsparse.readthedocs-hosted.com/en/latest/advancedFeatures.html#mpi-handling).
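
Before removing anything, it may help to confirm that mpi4py is actually present in the environment. A minimal sketch using only the standard library; the helper name `mpi4py_installed` is made up for illustration. It looks the package up without importing it, so the check itself should not trigger MPI initialization:

```python
import importlib.util


def mpi4py_installed() -> bool:
    """Return True if mpi4py can be found, without importing it."""
    # find_spec only locates a top-level package; it does not execute it,
    # so the MPI library is not loaded or initialized by this check.
    return importlib.util.find_spec("mpi4py") is not None


if __name__ == "__main__":
    if mpi4py_installed():
        print("mpi4py is installed; pyoptsparse may try to use it.")
    else:
        print("mpi4py is not installed; pyoptsparse should run without MPI.")
```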

whophil, Mar 25 '24 00:03

Hello @alicebain, I agree with Phil: this looks related to MPI, and I bet you will have the same issues if you directly import mpi4py. Setting the PYOPTSPARSE_REQUIRE_MPI environment variable as described in the documentation linked above (e.g. export PYOPTSPARSE_REQUIRE_MPI="no") is the recommended approach to deal with this. Let us know if you are still encountering issues.
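
The same setting can also be applied from inside a Python script, as long as it happens before pyoptsparse is imported. A minimal sketch, assuming pyoptsparse reads PYOPTSPARSE_REQUIRE_MPI at import time as the linked documentation describes; the helper name `disable_pyoptsparse_mpi` is hypothetical:

```python
import os


def disable_pyoptsparse_mpi() -> None:
    """Ask pyoptsparse to skip MPI (assumes the variable is read at import time)."""
    # Same value as the shell example: export PYOPTSPARSE_REQUIRE_MPI="no"
    os.environ["PYOPTSPARSE_REQUIRE_MPI"] = "no"


disable_pyoptsparse_mpi()

try:
    # Must come after the environment variable has been set.
    import pyoptsparse  # noqa: F401
except ImportError:
    # pyoptsparse is not installed in this environment.
    pyoptsparse = None
```

If the PSM3 and Open MPI messages still appear with this in place, the variable is probably being set too late, or mpi4py is being imported somewhere else in the script.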

marcomangano, Mar 25 '24 16:03

Closing due to inactivity.

ewu63, Apr 18 '24 20:04