GEOS icon indicating copy to clipboard operation
GEOS copied to clipboard

[Bug] Non-MPI build broken

Open klevzoff opened this issue 3 years ago • 5 comments

Describe the bug Non-MPI build is broken and not tested in CI.

To Reproduce Steps to reproduce the behavior:

  1. Configure with -DENABLE_MPI=OFF
  2. Build geosx
  3. See errors
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:826:38: error: unused parameter 'buf' [-Werror,-Wunused-parameter]
int MpiWrapper::recv( array1d< T > & buf,
                                     ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:828:27: error: unused parameter 'tag' [-Werror,-Wunused-parameter]
                      int tag,
                          ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:855:55: error: unused parameter 'buf' [-Werror,-Wunused-parameter]
int MpiWrapper::iSend( arrayView1d< T const > const & buf,
                                                      ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:857:28: error: unused parameter 'tag' [-Werror,-Wunused-parameter]
                       int tag,
                           ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:906:34: error: unused parameter 'value' [-Werror,-Wunused-parameter]
U MpiWrapper::prefixSum( T const value, MPI_Comm comm )
                                 ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:906:50: error: unused parameter 'comm' [-Werror,-Wunused-parameter]
U MpiWrapper::prefixSum( T const value, MPI_Comm comm )
                                                 ^
/home/klevtsov/work/GEOSX-develop/src/coreComponents/common/MpiWrapper.hpp:925:82: error: unused parameter 'comm' [-Werror,-Wunused-parameter]
T MpiWrapper::reduce( T const & value, Reduction const MPI_PARAM( op ), MPI_Comm comm )

... etc.

Expected behavior

  • [ ] Non-MPI build should work
  • [ ] Non-MPI build should be tested in at least one CI configuration

Alternatively ENABLE_MPI option should be deprecated and removed.

klevzoff avatar Jul 26 '21 21:07 klevzoff

Do you think having one target for non-MPI builds is worth the price? The interest looks minimal. (Who really uses GEOSX without MPI? What would be the cost of having a non-working MPI version for some time?). The cost of fixing it sometimes may not be too expensive. We have a lot of different options that we could test first on our CI (petsc, PVT...)

TotoGaz avatar Jul 26 '21 22:07 TotoGaz

Honestly, I don't know which approach is right. I imagine prospective users checking out code and trying it out on their laptops for the first time might try it without MPI and be immediately turned away by compilation errors. Hence the alternative proposed solution of just abandoning that option altogether and making MPI a hard requirement. I don't like the idea of having Schroedinger's code that might be working or not.

klevzoff avatar Jul 27 '21 05:07 klevzoff

I don't like the idea of having Schroedinger's code that might be working or not.

Quality Assurance is always a matter of cost/risk/benefit. Globally, there will be cases that are not worth the price automating. And in the meantime, you may have to accept the risks (that you can mitigate by running manual testing for ex.).

I do not know whether MPI is worth automating. But I'm convinced that checking w.r.t. our different linear algebra backends would bring more benefits (and surely other options). Maybe other options too?

Having an MPI option at all is disputable for an exascale HPC software: what is the use case? Debugging?

TotoGaz avatar Jul 27 '21 18:07 TotoGaz

Developer meeting discussion:

  • Consensus towards removing as a standard option.
  • Can leave in the code as an "experimental" feature, rather than completely stripping out
  • Should move all cmake options into one place, and clearly label certain as "advanced/experimental"
  • Simply too much effort to maintain. Should move toward simplifying build options.
  • Important thing is to maintain the MPI wrapper, this is what makes this easy to re-enable in the future. Side note: should clean up code for any direct MPI_xxxx calls.

joshua-white avatar Jul 29 '21 17:07 joshua-white

The jury says delete non-MPI builds.

rrsettgast avatar Aug 19 '21 17:08 rrsettgast

@paveltomin it's not planned, but we still need to do it!

TotoGaz avatar Sep 20 '23 23:09 TotoGaz

@paveltomin it's not planned, but we still need to do it!

but jury said no non-mpi!

paveltomin avatar Sep 20 '23 23:09 paveltomin

but jury said no non-mpi!

Which means that we have to properly remove the possibility to try a non-MPI build.

TotoGaz avatar Sep 20 '23 23:09 TotoGaz