ompi
--with-pmix=internal should be on by default
A vanilla configure invocation (e.g. ../configure CC=clang CXX=clang++ --prefix=/opt/ompi) leads to this error:
configure: ===== done with 3rd-party/openpmix configure =====
checking for pmix pkg-config name... pmix
checking if pmix pkg-config module exists... yes
checking for pmix pkg-config cflags... -I/opt/homebrew/Cellar/open-mpi/4.1.5/include
checking for pmix pkg-config ldflags... -L/opt/homebrew/Cellar/open-mpi/4.1.5/lib
checking for pmix pkg-config static ldflags... -L/opt/homebrew/Cellar/open-mpi/4.1.5/lib
checking for pmix pkg-config libs... -lpmix -lz
checking for pmix pkg-config static libs... -lpmix -lz
checking for pmix.h... no
configure: error: Could not find viable pmix build.
The solution is to add --with-pmix=internal.
I've discussed this with @devreal, who provided the above fix.
This has been true of the latest development version for the past few months and it does not depend on the compilers or system I am using, so I am not bothering to provide that.
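For reference, a minimal sketch of the workaround applied to the example invocation from this report. The compilers and prefix are the ones quoted above; the `CONFIGURE_ARGS` variable and the guard around `../configure` are illustrative additions, not part of Open MPI.

```shell
# Same arguments as the failing invocation, plus the workaround flag.
CONFIGURE_ARGS="CC=clang CXX=clang++ --prefix=/opt/ompi --with-pmix=internal"

# Run from a build directory one level below the source tree, as in the report.
if [ -x ../configure ]; then
    # Unquoted on purpose so the string splits into separate arguments.
    ../configure $CONFIGURE_ARGS
else
    echo "no ../configure here; would run: ../configure $CONFIGURE_ARGS"
fi
```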
That is unlikely to happen (since this is an intentional design that has been extensively discussed internally before).
I'd rather suggest that @devreal add a PMIx formula to Homebrew and have Open MPI use it.
Can you please confirm there is a pmix.pc somewhere in /opt/homebrew/Cellar/open-mpi/4.1.5?
If so, I am more than happy to consider that this file should probably not be there (I suspect the embedded PMIx installs pmix.pc, which is wrong since pmix.h and friends are not installed).
For the record: I have no business with homebrew :)
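One way to check for the mismatch described above (a pmix.pc that pkg-config can find, but no pmix.h behind it) is sketched below. The helper name `pmix_header_in_cflags` is hypothetical, and the sketch assumes include paths contain no whitespace; it only mirrors the idea of the check, not what configure actually runs.

```shell
# pmix_header_in_cflags CFLAGS
# Succeeds if any -I directory named in CFLAGS contains pmix.h.
# Assumes include paths contain no whitespace.
pmix_header_in_cflags() {
    for flag in $1; do
        case "$flag" in
            -I*)
                if [ -f "${flag#-I}/pmix.h" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

# Usage: feed it whatever pkg-config reports for the pmix module.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists pmix; then
    if pmix_header_in_cflags "$(pkg-config --cflags pmix)"; then
        echo "pmix.pc looks usable"
    else
        echo "pmix.pc found, but pmix.h is missing from its include path"
    fi
fi
```

Note that pkg-config may omit system include directories from its `--cflags` output, so a "missing" result for a PMIx installed in a default system location would need a second look.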
From a user's perspective, it does seem irritating that configure would fail without considering the internal pmix and does not provide a hint that --with-pmix=internal should be provided.
If Homebrew is the problem here, then I will just uninstall OMPI from Homebrew, because the whole point of building OMPI from source is to use it.
The fact that OMPI devs believe that it is reasonable for ./configure && make && make install to fail and for users to have to ask their OMPI developer friends for help in order to build it from source is ridiculous. Can you name another major OSS project with such a broken out-of-the-box user experience?
Removing Homebrew OMPI fixes this, but I still object to the fact that configure insists on using an external build before an internal one, when this fails. OMPI should either default to internal or, at the very least, figure out if the external PMIx it insists on using actually works.
Well well ...
configure currently fails because of what I consider a busted environment.
(feel free to blame Open MPI as much as you wish for indirectly causing that)
Could configure have done a better job at detecting such a bozo case and falling back to the internal PMIx?
Sure, PRs are welcome.
Meanwhile, a trivial fix is to
rm /opt/homebrew/Cellar/open-mpi/4.1.5/lib/pkgconfig/pmix.pc
I suspect a busted environment can cause configure to fail on any OSS project.
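A less destructive variant of the rm above is to rename the file, since pkg-config only considers files ending in .pc. This is a sketch only; the helper name `disable_pc_file` is made up for illustration.

```shell
# disable_pc_file PATH
# Renames PATH to PATH.disabled so pkg-config no longer picks it up.
disable_pc_file() {
    if [ -f "$1" ]; then
        mv "$1" "$1.disabled"
    else
        echo "nothing to disable: $1" >&2
        return 1
    fi
}

# Usage, with the path from this thread:
# disable_pc_file /opt/homebrew/Cellar/open-mpi/4.1.5/lib/pkgconfig/pmix.pc
```

Renaming the file back restores the original state, which is handy if the packaged install needs to stay untouched afterwards.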
@devreal I am sorry I misread your handle and did not realize you are a friend of ours.
The Homebrew installation of OMPI works fine in every way, so I don't know how you can describe it as busted. It's just not usable for satisfying the dependencies of OMPI when building from source, which is 100% reasonable for a packaged build.
I furthermore reject the assertion that having an OMPI package build on my system constitutes a bozo use case for building OMPI from source. You understand that I write MPI libraries and need to test against multiple versions of MPI, including both the ones users have most often (e.g. Homebrew builds) and the latest and greatest dev build, so I catch new bugs as soon as they appear?
I'm not going to work on a PR until it's clear that OMPI devs accept that this is a problem that needs to be fixed. Otherwise, they might just reject the PR as unneeded.
At this stage, I believe the embedded PMIx built by Open MPI should not install pmix.pc. I consider it a bug, and it should be fixed.
Fast forward: pkg-config does find it, but since pmix.h and friends are not installed, PMIx is unusable, and so I consider the environment to be busted.
By bozo use case, I meant an existing but unusable pmix.pc.
Having multiple Open MPI installations on your system (including in the default location(s)) is supposed to work.
FWIW, this is the fix I have in mind. If you choose to apply it, you will need to have recent autotools installed and manually run autogen.pl (or autogen.pl --force if you build from a tarball).
diff --git a/opal/mca/pmix/pmix3x/pmix/Makefile.am b/opal/mca/pmix/pmix3x/pmix/Makefile.am
index 11f9918e98..c07fa73411 100644
--- a/opal/mca/pmix/pmix3x/pmix/Makefile.am
+++ b/opal/mca/pmix/pmix3x/pmix/Makefile.am
@@ -62,4 +62,7 @@ dist-hook:
env LS_COLORS= sh "$(top_srcdir)/config/distscript.sh" "$(top_srcdir)" "$(distdir)" "$(PMIX_VERSION)" "$(PMIX_REPO_REV)"
pkgconfigdir = $(libdir)/pkgconfig
+pkgconfig_DATA =
+if ! PMIX_EMBEDDED_MODE
pkgconfig_DATA += maint/pmix.pc
+endif
@jsquyres generally speaking, should we try to detect such busted environments, and if yes, should we fall back to the internal PMIx or abort after suggesting to configure --with-pmix=internal?
Or should we simply improve the final error message to suggest the use of --with-pmix=internal?
Thanks!
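The first option raised in this comment (detect the unusable external PMIx, then fall back with a hint) can be sketched as follows. `choose_pmix` is a hypothetical helper written for illustration; it is not Open MPI's configure logic, and a real check would also try to compile and link against the library.

```shell
# choose_pmix INCLUDE_DIR
# Hypothetical helper, not Open MPI's configure logic: prints "external"
# when INCLUDE_DIR holds pmix.h, otherwise prints "internal" and a hint.
choose_pmix() {
    if [ -n "$1" ] && [ -f "$1/pmix.h" ]; then
        echo external
    else
        echo "external PMIx unusable (no pmix.h in '$1'); using internal" >&2
        echo internal
    fi
}

# Usage, with the include directory pkg-config reported in this thread:
pmix_mode=$(choose_pmix /opt/homebrew/Cellar/open-mpi/4.1.5/include)
echo "would build with $pmix_mode PMIx"
```

The "abort with a suggestion" alternative would replace the second branch with an error message naming --with-pmix=internal and a non-zero exit.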
I'd be content with this:
> Or should we simply improve the final error message to suggest the use of --with-pmix=internal?
although the more automated solutions are great.
@ggouaillardet
- I think your patch seems reasonable. You'll also push this upstream to PMIx? Can you see if the same thing is happening on main and v5.0.x, too?
- I'd say that configure did detect the busted external PMIx environment properly (i.e., where an OMPI install had erroneously installed a pmix.pc). configure reacted appropriately by choosing not to use the external PMIx. Adding a helper message would be nice, but more generally: why didn't configure automatically fall back to using the internal PMIx when the external one was not suitable?
@jeffhammond
A little rationale on why Open MPI "prefers" external libraries: the reason we still bundle some libraries in Open MPI is complex -- these days, it's probably a mix of some lingering technical issues (e.g., guaranteed compatibility), but also a healthy dose of user experience issues (e.g., universal out-of-box build experience on systems such as MacOS, especially when outside of systems such as Homebrew). Regardless, bundling dependent libraries is uncommon. As such, Open MPI first looks around the system for non-bundled versions of these libraries. If we find a suitable one, we use that rather than the bundled version.
I think your situation fell into a case that ultimately exposed a bug in Open MPI (that we shouldn't install pmix.pc), which we definitely need to fix.
@jsquyres I will suggest the patch, but I do not expect a new PMIx 3 release, so we might have to do it locally.
v5 is different in the sense that we install the full-blown PMIx (and hwloc, FWIW), so it would install both pmix.pc and the header files. On one hand, the environment would not be busted, but on the other hand, Open MPI (compiled with internal PMIx and hwloc) would add both PMIx and hwloc to the environment, which might not be desirable ... but this is another story.
I do not know why configure does not automatically fall back to using the internal PMIx in this case.
That being said, is this a bug or a feature? (this is a genuine question, since I can argue for both views)
I doubt we'd be accepting any changes regarding EMBEDDED_MODE as PMIx no longer supports that method in any active release branch. Another approach to solving this problem might be to just fix the "internal" configure logic so it ignores any .pc file it finds as it shouldn't be using it for any embedded package, not just PMIx. The package logic was strictly for use when finding an externally installed version.
For Open MPI v5.0.8, it seems that this issue still exists. Before I found this issue, I installed PMIx myself to work around it.