
Potential fix for Apple silicon build

Open mjs271 opened this issue 1 year ago • 3 comments

For the past few months (since roughly the beginning of November), I haven't been able to build EKAT successfully on my Mac laptop, which has an M1 chip and runs macOS Monterey. For context, the EKAT version I've been using (661dbc52) is the one used in the EAGLES project by haero and, by extension, mam4xx.

I've been attempting to build with the following configuration flags:

-DCMAKE_CXX_COMPILER=mpic++
-DCMAKE_Fortran_COMPILER=mpifort
-DKokkos_ENABLE_DEPRECATED_CODE=OFF
-DKokkos_ENABLE_DEBUG=TRUE
-DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=OFF
-DKokkos_ENABLE_CUDA=OFF
-DKokkos_ENABLE_SERIAL=ON
-DEKAT_ENABLE_FPE=OFF

Here, mpic++ wraps Apple clang v14 and mpifort wraps gfortran 12.2. When I run make, the errors I see are of this type:

EKAT/src/ekat/util/ekat_feutils.hpp:xx:yy: error: no member named '__control' in 'fenv_t' [...]
EKAT/src/ekat/util/ekat_feutils.hpp:xx:yy: error: no member named '__mxcsr' in 'fenv_t' [...]
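
One plausible reading of these errors: `__control` and `__mxcsr` are x87/SSE control-register fields, which an x86 `fenv_t` exposes but an arm64 one does not. This is an assumption about the platform headers, not something checked against EKAT itself. Below is a minimal sketch of querying floating-point exception flags through the standard `<cfenv>` functions instead of through `fenv_t` members. The names `fenv_arch` and `raises_divbyzero` are hypothetical and are not part of EKAT.

```cpp
#include <cfenv>
#include <string>

// Report which architecture family this translation unit targets,
// using compiler-predefined macros (hypothetical helper, not EKAT code).
inline std::string fenv_arch() {
#if defined(__aarch64__) || defined(__arm64__)
  return "arm64";
#elif defined(__x86_64__) || defined(__i386__)
  return "x86";
#else
  return "other";
#endif
}

// Return true if computing num/den raises the divide-by-zero flag.
// Only standard <cfenv> calls are used, so no fenv_t members are touched.
inline bool raises_divbyzero(double num, double den) {
  std::feclearexcept(FE_ALL_EXCEPT);
  // volatile keeps the compiler from folding the division at compile time
  volatile double n = num;
  volatile double d = den;
  volatile double q = n / d;
  (void)q;
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

This only reads the sticky exception flags. Enabling traps (which is what an FPE option typically controls) has no portable standard interface, which is presumably why the header reaches into architecture-specific fields.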

The apparent fix is to wrap an #include in ekat_arch.cpp in an #ifdef guard, namely

#ifdef EKAT_ENABLE_FPE
  #include "ekat/util/ekat_feutils.hpp"
#endif
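
A minimal sketch of the same guard pattern, with a no-op fallback so that callers still compile when the feature is off. `DEMO_ENABLE_FPE` is a hypothetical stand-in for `EKAT_ENABLE_FPE`, and `demo_current_fpe_flags` is an illustrative name, not an EKAT function. Note that `#ifdef` tests only whether the macro is defined, not its value, so the guard relies on `-DEKAT_ENABLE_FPE=OFF` leaving the macro undefined (an assumption about how EKAT's configuration header is generated).

```cpp
#include <cfenv>

// DEMO_ENABLE_FPE is a hypothetical stand-in for EKAT_ENABLE_FPE.
#ifdef DEMO_ENABLE_FPE
// The architecture-specific header would be included here, so it is
// only ever parsed when the feature is requested.
constexpr bool demo_fpe_compiled_in = true;
inline int demo_current_fpe_flags() {
  return std::fetestexcept(FE_ALL_EXCEPT);
}
#else
// Fallback: same interface, no dependency on platform-specific code.
constexpr bool demo_fpe_compiled_in = false;
inline int demo_current_fpe_flags() { return 0; }
#endif
```

If the build system instead defines the macro to 0 when the option is off, `#if DEMO_ENABLE_FPE` would be the appropriate test rather than `#ifdef`.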

After a successful build, I get the following from make test:

95% tests passed, 4 tests failed out of 75

Label Time Summary:
MustFail    =   0.64 sec*proc (3 tests)

Total Test time (real) =  22.83 sec

The following tests FAILED:
	 53 - comm_np1 (Failed)
	 54 - comm_np2 (Failed)
	 55 - comm_np3 (Failed)
	 56 - comm_np4 (Failed)

The failure log for these tests indicates that this is expected on macOS:

A request was made to bind a process, but at least one node does NOT
support binding processes to cpus.

Node: <node>

Open MPI uses the "hwloc" library to perform process and memory
binding. This error message means that hwloc has indicated that
processor binding support is not available on this machine.

On OS X, processor and memory binding is not available at all (i.e.,
the OS does not expose this functionality).

Given all of this, I'm not sure whether this is a tenable fix or whether it has knock-on effects. I did want to put it on the EKAT team's radar, though.

@jeff-cohere

mjs271, Jan 19 '23 21:01