mpich

Installing 3.4.x fails with OS X + cmake + Fortran

Open leofang opened this issue 4 years ago • 5 comments

Hello, I'm not sure whether this is a duplicate of #5004. We (Conda-Forge) have seen several failures when downstream packages try to build MPI support against MPICH 3.4.x (we tested 3.4.0 and 3.4.1). The failure occurs only when all of the following conditions hold:

  • On OS X, non-M1 (x86-64) builds
  • The downstream package builds with cmake
  • The downstream package needs Fortran support

The build fails in cmake's FindMPI module. Here is a snippet of the error log:

-- Found MPI_C: $PREFIX/lib/libmpi.dylib (found version "3.1") 
-- Could NOT find MPI_Fortran (missing: MPI_Fortran_WORKS) 
-- Could NOT find MPI (missing: MPI_Fortran_FOUND) (found version "3.1")
    Reason given by package: MPI component 'CXX' was requested, but language CXX is not enabled.  

CMake Error at CMakeLists.txt:46 (message):
  MPI not found, specify the MPI Fortran compiler with MPI_Fortran_COMPILER
  variable
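
For anyone reading a log like this: each "Could NOT find" line names a component and the check that did not pass. Here the missing item is MPI_Fortran_WORKS, which suggests that cmake found a Fortran MPI wrapper but could not build a test program with it. The snippet below is a minimal sketch that extracts those pairs from a saved configure log. The function name missing_mpi_components and the file name cmake.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a cmake configure log on stdin and print
# "<component>: <missing variables>" for every "Could NOT find" line.
missing_mpi_components() {
    sed -n 's/^-- Could NOT find \([A-Za-z0-9_]*\) (missing: \([^)]*\)).*/\1: \2/p'
}
```

Usage, assuming the configure output was saved to cmake.log: missing_mpi_components < cmake.log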

Our workaround is to rebuild MPICH with an additional configure flag --disable-opencl, and so far all downstream package maintainers reported it works, which is why I was wondering if this is related to #5004.
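
A minimal sketch of this workaround, assuming a plain source build of MPICH 3.4.x run from the top of the source tree. Only --disable-opencl comes from this thread. The install prefix, the DRY_RUN switch, and the function name configure_mpich are illustrative assumptions and are not taken from the Conda-Forge recipe.

```shell
#!/bin/sh
# Sketch only: configure MPICH with OpenCL detection disabled.
# PREFIX and DRY_RUN are assumptions made for this example.
PREFIX="${PREFIX:-$HOME/mpich-install}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = actually build

configure_mpich() {
    # --disable-opencl is the flag reported in this thread;
    # the rest of the command line is assumed.
    set -- ./configure --prefix="$PREFIX" --disable-opencl
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" && make && make install
    fi
}

configure_mpich
```

With the default DRY_RUN=1 the script only prints the configure command, so it can be checked before a real build is started with DRY_RUN=0.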

Ref: https://github.com/conda-forge/mpich-feedstock/issues/56

leofang avatar Feb 03 '21 16:02 leofang

The cmake log tells nothing about what went wrong during mpich build.

hzhou avatar Feb 03 '21 16:02 hzhou

> Our workaround is to rebuild MPICH with an additional configure flag --disable-opencl, and so far all downstream package maintainers reported it works, which is why I was wondering if this is related to #5004.

Based on your experience, I would guess that it is related. Updating the hwloc submodule is on a shortlist of issues I plan to address for an upcoming 3.4.2.

raffenet avatar Feb 03 '21 16:02 raffenet

> The cmake log tells nothing about what went wrong during mpich build.

@hzhou Unfortunately there's not much I can share other than pointing you to the full failed CI log (see below). We build MPICH for Linux (x86-64, ppc64le, aarch64) and Mac OS X (x86-64, arm64), and only this particular combination fails.

If it helps, this is the recipe for how we currently build MPICH (with the workaround flag added at line 107): https://github.com/conda-forge/mpich-feedstock/blob/master/recipe/build-mpi.sh (it has some Conda-Forge-specific things mixed in, but is otherwise clear)

This is the CI log for one such failed build reported by downstream maintainers: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=271731&view=logs&j=b4588902-138a-5967-ecc7-b3fc381bfda2&t=5a7a20e7-b634-5369-ebb8-6b51f51eb32a

leofang avatar Feb 03 '21 16:02 leofang

Upstream report to cmake: https://gitlab.kitware.com/cmake/cmake/-/issues/21741

isuruf avatar Feb 14 '21 23:02 isuruf

I believe the dependency is picked up by hwloc. hwloc currently picks up a few libraries, including cuda, opencl, and nvml. I suspect mpich doesn't really depend on these features. @raffenet, should we disable all of these when configuring the embedded hwloc?

hzhou avatar Mar 18 '21 01:03 hzhou
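
To make the last proposal concrete, here is a rough sketch of what disabling those optional hwloc backends could look like at configure time. hwloc's own configure script accepts --disable-cuda, --disable-opencl, and --disable-nvml. Whether MPICH's top-level configure forwards all three to the embedded hwloc is an assumption here, since only --disable-opencl is reported to work in this thread. The helper name hwloc_disable_flags is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of optional hwloc backends into
# --disable-<name> configure flags, one per line.
hwloc_disable_flags() {
    for feature in "$@"; do
        printf '%s\n' "--disable-$feature"
    done
}

# Dry run: print the configure command instead of running it.
# Assumption: these flags reach the embedded hwloc's configure.
echo ./configure $(hwloc_disable_flags cuda opencl nvml)
```

Keeping the backend names in one list makes it easy to add or drop a backend without touching the rest of the build script.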