Compiler cmake issue

Open koecher opened this issue 3 years ago • 22 comments

This is a draft PR to figure out the problems we have with setting the compiler and CMake.

Updated packages:

  • Trilinos 13.0.1
  • (optional SuperLU_dist 6.4.0 seems to be working with trilinos 13.0.1)

Developments

  • New detection of BLAS, LAPACK, MKL, openBLAS in trilinos.package
  • removes explicit compiler settings for cmake in trilinos.package
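As a rough illustration of the second point: when no `CMAKE_<LANG>_COMPILER` is passed on the command line, CMake reads `CC`, `CXX` and `FC` from the environment on the first configure run. The helper below is a minimal sketch of the two approaches. The function name `candi_cmake_compiler_args` and its `explicit`/`env` modes are hypothetical and are not taken from trilinos.package.

```shell
#!/bin/sh
# Sketch only: candi_cmake_compiler_args is a hypothetical helper, not candi code.
# "explicit" mode mimics the old behaviour of passing every compiler to cmake.
# "env" mode prints nothing, so CMake falls back to $CC, $CXX and $FC itself.
candi_cmake_compiler_args() {
    mode="$1"
    if [ "$mode" = "explicit" ]; then
        printf '%s\n' \
            "-DCMAKE_C_COMPILER=${CC:-mpicc}" \
            "-DCMAKE_CXX_COMPILER=${CXX:-mpicxx}" \
            "-DCMAKE_Fortran_COMPILER=${FC:-mpifort}"
    fi
}
```

With `env` mode the compiler choice lives in one place (the user's environment), which is what the test matrix below exercises.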

Successful tests on freshly set up virtual machines with the recommended installation instructions @koecher

  • [x] ubuntu 21.04, gcc-10, openmpi from OS, cmake 3.18 from OS, candi sets CC,CXX,etc to mpi compilers
  • [x] ubuntu 21.04, gcc-10, openmpi from OS, cmake 3.20.5 from candi (unrelated superlu problem for deal.II / linking blas)
  • [x] CI / ubuntu 18.04 minimal (pull_request) Successful
  • [x] CI / ubuntu 20.04 (pull_request) Successful
  • [x] centos 7, gcc-8.1.0/mpich-3.2 self-compiled with all env. variables set, cmake 3.20.5 from candi, Successful
  • [x] ubuntu 20.04 (LTS), gcc-9, openmpi from OS, cmake 3.16.3 from OS, Successful
  • [x] ubuntu 20.04 (LTS), gcc-9, openmpi from OS, cmake 3.20.5 from candi, Successful
  • [ ] ubuntu 20.04 (LTS), gcc-10 alternative, openmpi from OS, cmake 3.16.3 from OS
  • [ ] ubuntu 20.04 (LTS), gcc-10 alternative, openmpi from OS, cmake 3.20.5 from candi
  • [x] opensuse 15 / Leap, gcc-7.5, mpich from OS, cmake 3.17.0 from OS, Successful
  • [x] CI / OSX gcc (pull_request) Has Failure in a test after install, seems unrelated
  • [x] CI / OSX clang (pull_request), Successful
  • [x] macOS 11.4.1 (bigsur), gcc-11, openmpi, cmake 3.21.0 from homebrew, Successful

@gfcas

  • [x] Ubuntu 18.04 (LTS), gcc-7.5, openmpi from OS, cmake 3.20.5 from OS
  • [x] Ubuntu 20.04 (LTS), gcc-9.3, openmpi from OS, cmake 3.20.5 from OS
  • [x] Ubuntu 20.04 (LTS), gcc-10.3, openmpi from OS, cmake 3.20.5 from OS, succeeded, https://github.com/gfcas/candi/actions/runs/1022810879

Clean ups and TODOs

  • remove local.cfg commit
  • Trilinos configuration variables in candi
  • BLAS and LAPACK dirs
  • TODOs in trilinos.package
  • SuperLU_dist BLAS/LAPACK linking problem (only for v7.0.0), resolved

koecher avatar Jul 09 '21 17:07 koecher

Related to #207 #159

koecher avatar Jul 09 '21 17:07 koecher

@gfcas can you test this branch on your machines?

koecher avatar Jul 09 '21 18:07 koecher

@koecher of course, I will also reconfigure the github actions runners. On my personal laptop maybe it will take some days because I got a new one.

ghost avatar Jul 12 '21 08:07 ghost

I think it is very good that we update the Trilinos package, which is one of the major ones for parallel computing. Should we also include Umfpack and Mumps as described in the deal.II readme (https://www.dealii.org/current/external-libs/trilinos.html)? For my own work I was recently interested in the Trilinos subpackage ShyLU.

ghost avatar Jul 12 '21 08:07 ghost

Let us concentrate on Trilinos and deal.II for now to find a solution to the problem

koecher avatar Jul 12 '21 10:07 koecher

@gfcas no reviews so far, everything is draft for testing

koecher avatar Jul 12 '21 11:07 koecher

@koecher:

  • [x] Ubuntu 18.04 (LTS), gcc-7.5, openmpi from OS, cmake 3.20.5 from OS
  • [x] Ubuntu 20.04 (LTS), gcc-9.3, openmpi from OS, cmake 3.20.5 from OS
  • [x] Ubuntu 20.04 (LTS), gcc-10.3, openmpi from OS, cmake 3.20.5 from OS just succeeded, see https://github.com/gfcas/candi/actions/runs/1022810879

ghost avatar Jul 12 '21 14:07 ghost

I've cleaned up the discussion. Please only report on the compiler-cmake issue. SuperLU is not of interest anymore here

koecher avatar Jul 12 '21 15:07 koecher

In order to get this issue ready for v9.3.1 maybe we should consistently test it with DEAL_II_VERSION=v9.3.1?

ghost avatar Jul 28 '21 11:07 ghost

I think we need a test for intel compilers, the other systems look good.

Testing with the fixed v9.3.1 is okay.

koecher avatar Jul 28 '21 15:07 koecher

Remarks for macOS:

Packages:

  • parmetis 4.0.3
  • hdf5 1.10.7
  • superlu_dist 6.4.0
  • p4est 2.2
  • trilinos 13.0.1 (without compiler settings, without -lgfortran)
  • deal.II v9.3.1 (without compiler settings)

Special configuration: LC_ALL=C and unset LANGUAGE since step-32 doesn't work without it

export OMPI_CC=gcc-11
export OMPI_CXX=g++-11
export OMPI_FC=gfortran-11
export CC=mpicc
export CXX=mpicxx
export FC=mpifort
export FF=mpifort
export LC_ALL=C
unset LANGUAGE
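Since this setup depends entirely on the environment, a small pre-flight check can catch a forgotten export before a long build starts. The following is only a sketch: `check_compiler_env` is a hypothetical helper and not part of candi. It checks only that `CC`, `CXX` and `FC` are non-empty, not that the compilers they name actually work.

```shell
#!/bin/sh
# Sketch only: check_compiler_env is a hypothetical helper, not candi code.
# Prints "ok" if CC, CXX and FC are all set, otherwise lists the missing ones.
check_compiler_env() {
    missing=""
    for var in CC CXX FC; do
        # Indirect lookup of the variable named in $var (POSIX sh compatible).
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            missing="$missing $var"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

It could be called as `check_compiler_env || exit 1` right after the exports above.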

Tested step tutorials

  • step-1
  • step-7
  • step-29 (using deal.II complex and umfpack as direct solver)
  • step-32 (needs LC_ALL=C)
  • step-33
  • step-40 (only Trilinos installed)
  • step-72
  • hdf5 file output with own software

Additional system preferences: (if dylib library not found, e.g. for TrilinosWrappers::SolverDirect)

  • System Preferences / Security & Privacy / Developer Tools / Terminal

koecher avatar Jul 29 '21 13:07 koecher

Do you think it is a good time to go to Trilinos 13 at this point? I am not sure we have tested it much with deal.ii.

tjhei avatar Jul 29 '21 18:07 tjhei

Well, this was a trial to see whether things work so far. I think we should test the parallel features of Trilinos 13 a little more. From the current tests on macOS, I'm fine with Trilinos 13.

koecher avatar Jul 29 '21 18:07 koecher

@tjhei do you have the chance to test this with intel compilers?

koecher avatar Jul 29 '21 18:07 koecher

Deal.II currently suggests using 12.18, see https://github.com/dealii/dealii/blob/94b2450484e130a74051c7bd7230c8d6f79b98b2/doc/external-libs/trilinos.html#L53 I don't think it is a good idea to go to 13 right now, especially because it does not fix any problem, or does it?

What do you want me to test? Just run with default settings with Intel and see if things compile?

tjhei avatar Jul 29 '21 18:07 tjhei

  • can you test if this PR is running smoothly? Here is a local.cfg to set the packages.
  • I suggest to stay with trilinos 12 for deal.II v9.3 and
  • introduce an additional trilinos13.package for master

This PR isn't useful for a merge. The things we learned here should be used for clean PRs in the future.

koecher avatar Jul 29 '21 18:07 koecher

@zjiaqi2018 can you please check out this PR of candi on palmetto or frontera and compile with current Intel compilers and MKL? Please set MKL=ON in the local.cfg. Let us know what changes you need to do or if you get any errors.

tjhei avatar Jul 29 '21 21:07 tjhei
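For readers following along, the two settings named in this thread could look like this in local.cfg. This is a minimal sketch that assumes local.cfg uses plain `NAME=value` shell assignments. Only `MKL=ON` and `DEAL_II_VERSION=v9.3.1` come from the discussion, and a real local.cfg will contain further options.

```shell
# local.cfg (sketch): only the two settings mentioned in this thread.
# Test against the fixed deal.II release.
DEAL_II_VERSION=v9.3.1
# Use Intel MKL for BLAS/LAPACK.
MKL=ON
```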

I tested it on frontera, and it seems to stop at: image

zjiaqi2018 avatar Jul 30 '21 15:07 zjiaqi2018

with -D DEAL_II_COMPONENT_EXAMPLES=OFF, it doesn't work either: image

zjiaqi2018 avatar Jul 30 '21 16:07 zjiaqi2018

Is this on Frontera? Rene had to disable gold linker, see "-fuse" in https://github.com/geodynamics/aspect/wiki/Installation-on-Frontera

tjhei avatar Jul 30 '21 18:07 tjhei

Can you post the deal.ii summary.log and your changes to the settings once it works?

tjhei avatar Jul 30 '21 18:07 tjhei

It works now. For the local changes, I just followed Rene's instructions you mentioned. image

zjiaqi2018 avatar Jul 30 '21 19:07 zjiaqi2018