
{cae}[intel/2019b] FreeFEM v4.5 (WIP)

Open boegel opened this issue 5 years ago • 12 comments

(created using eb --new-pr) requires https://github.com/easybuilders/easybuild-easyblocks/pull/1969

Still WIP: the installation works, but I'm hitting errors like this when trying a non-trivial example: `Intel MKL FATAL ERROR: Cannot load symbol MKLMPI_Get_wrappers` (cf. https://github.com/easybuilders/easybuild-framework/issues/1673)

boegel avatar Feb 20 '20 21:02 boegel

Travis test report: 2/2 runs failed - see https://travis-ci.org/easybuilders/easybuild-easyconfigs/builds/653180385

Only showing partial log for 1st failed test suite run 19961.1; full log at https://travis-ci.org/easybuilders/easybuild-easyconfigs/jobs/653180386

...
ERROR: test__parse_easyconfig_FreeFEM-4.5-intel-2019b.eb (test.easyconfigs.easyconfigs.EasyConfigTest)
Test for parsing of easyconfig FreeFEM-4.5-intel-2019b.eb
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<string>", line 1, in innertest
  File "/home/travis/build/easybuilders/easybuild-easyconfigs/test/easyconfigs/easyconfigs.py", line 775, in template_easyconfig_test
    ecs = process_easyconfig(spec)
  File "/home/travis/virtualenv/python2.7.15/lib/python2.7/site-packages/easybuild/framework/easyconfig/easyconfig.py", line 1802, in process_easyconfig
    raise EasyBuildError("Failed to process easyconfig %s: %s", spec, err.msg)
EasyBuildError: "Failed to process easyconfig /home/travis/build/easybuilders/easybuild-easyconfigs/easybuild/easyconfigs/f/FreeFEM/FreeFEM-4.5-intel-2019b.eb: No software-specific easyblock 'EB_FreeFEM' found for FreeFEM"

======================================================================
FAIL: test_changed_files_pull_request (test.easyconfigs.easyconfigs.EasyConfigTest)
Specific checks only done for the (easyconfig) files that were changed in a pull request.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/easybuilders/easybuild-easyconfigs/test/easyconfigs/easyconfigs.py", line 743, in test_changed_files_pull_request
    self.assertTrue(False, error_msg)
AssertionError: Failed to find parsed easyconfig for FreeFEM-4.5-intel-2019b.eb (and could not isolate it in easyconfigs archive either)

----------------------------------------------------------------------
Ran 9169 tests in 515.173s

FAILED (failures=1, errors=1)
ERROR: Not all tests were successful.

*bleep, bloop, I'm just a bot (boegelbot v20180813.01)* Please talk to my owner @boegel if you notice me acting stupid, or submit a pull request to https://github.com/boegel/boegelbot to fix the problem.

boegelbot avatar Feb 20 '20 21:02 boegelbot
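
Editor's note: the first failure above is expected while the required easyblocks pull request is unmerged, since the test suite cannot find a class named `EB_FreeFEM`. As a minimal sketch of the naming convention visible in the error message (the prefix `EB_` followed by the software name), and not EasyBuild's actual implementation: the helper name `expected_easyblock_class` is hypothetical, and names containing non-alphanumeric characters are deliberately rejected because their real encoding is not reproduced here.

```python
import re


def expected_easyblock_class(software_name: str) -> str:
    """Return the software-specific easyblock class name for a simple name.

    Simplified sketch: only purely alphanumeric names are handled.
    """
    if not re.fullmatch(r"[A-Za-z0-9]+", software_name):
        # Assumption: special characters get encoded by EasyBuild in a way
        # this sketch does not attempt to reproduce.
        raise ValueError("name needs encoding that this sketch does not cover")
    return "EB_" + software_name


print(expected_easyblock_class("FreeFEM"))  # prints EB_FreeFEM
```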

I think the `Cannot load symbol MKLMPI_Get_wrappers` issue stems from the PETSc dependency: it links to libmkl_blacs*, for which there's only a static library (.a), not a dynamic library (.so). That is exactly the issue being discussed at https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302 .

See also https://lists.mcs.anl.gov/pipermail/petsc-dev/2016-January/018651.html; it seems to be a known but unresolved issue in PETSc?

boegel avatar Feb 20 '20 21:02 boegel

> I think the `Cannot load symbol MKLMPI_Get_wrappers` issue stems from the PETSc dependency: it links to libmkl_blacs*, for which there's only a static library (.a), not a dynamic library (.so). That is exactly the issue being discussed at https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302 .
>
> See also https://lists.mcs.anl.gov/pipermail/petsc-dev/2016-January/018651.html; it seems to be a known but unresolved issue in PETSc?

I think this issue can be fixed by explicitly linking the MPI libraries, but I haven't faced it for a while. Sorry, I have a hard time deciphering the EasyBuild logs; where exactly is this error showing up, please?

prj- avatar Feb 21 '20 07:02 prj-

Eh, there are .so versions of all the libmkl_blacs_* libraries, so I'm not sure what you mean by this.

akesandgren avatar Feb 21 '20 07:02 akesandgren
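
Editor's note: whether shared BLACS libraries are present can be checked directly on the system in question. The following is a minimal sketch, assuming the MKL module sets `MKLROOT` and that the libraries live under `lib/intel64` (the layout may differ between MKL versions); the function name `classify_blacs_libs` is made up for illustration.

```python
import os
from pathlib import Path


def classify_blacs_libs(libdir):
    """Group libmkl_blacs_* files in libdir into static (.a) and shared (.so)."""
    found = {"static": [], "shared": []}
    for path in sorted(Path(libdir).glob("libmkl_blacs_*")):
        if path.suffix == ".a":
            found["static"].append(path.name)
        elif path.suffix == ".so":
            found["shared"].append(path.name)
    return found


if __name__ == "__main__":
    mklroot = os.environ.get("MKLROOT")
    if mklroot:
        # Assumed layout: $MKLROOT/lib/intel64
        print(classify_blacs_libs(Path(mklroot) / "lib" / "intel64"))
    else:
        print("MKLROOT is not set; load the MKL module first")
```

An empty `shared` list would support the static-only hypothesis; a non-empty one would match the observation in the comment above.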

> > I think the `Cannot load symbol MKLMPI_Get_wrappers` issue stems from the PETSc dependency: it links to libmkl_blacs*, for which there's only a static library (.a), not a dynamic library (.so). That is exactly the issue being discussed at https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302 . See also https://lists.mcs.anl.gov/pipermail/petsc-dev/2016-January/018651.html; it seems to be a known but unresolved issue in PETSc?
>
> I think this issue can be fixed by explicitly linking the MPI libraries, but I haven't faced it for a while. Sorry, I have a hard time deciphering the EasyBuild logs; where exactly is this error showing up, please?

@prj- You won't find the `Cannot load symbol MKLMPI_Get_wrappers` errors in the Travis logs, since Travis only runs the test suite; it doesn't actually try to install or use the software that the easyconfig (.eb) file installs.

The error also doesn't pop up at all during the installation, only when running a non-trivial example with `mpirun FreeFem++-mpi test.edp`.

Do you have more information on what you mean by "explicitly linking the MPI libraries"?

boegel avatar Feb 21 '20 10:02 boegel
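
Editor's note: since the failure only appears at run time, it can help to see which MKL and MPI libraries the binary resolves dynamically. Below is a minimal sketch that parses `ldd` output; the function names are hypothetical, and it assumes `FreeFem++-mpi` is in `PATH` on a Linux system. Libraries that were linked statically (from a .a archive) do not appear in `ldd` output at all, so their absence from the listing is itself informative.

```python
import re
import shutil
import subprocess

# Matches lines such as "libfoo.so => /path/libfoo.so (0x...)" or "libfoo.so => not found"
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path (None if not found)."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            name, target = match.groups()
            libs[name] = None if target == "not found" else target
    return libs


def mkl_and_mpi_libs(libs):
    """Keep only entries whose name starts with libmkl or libmpi."""
    return {n: p for n, p in libs.items() if n.startswith(("libmkl", "libmpi"))}


if __name__ == "__main__":
    exe = shutil.which("FreeFem++-mpi")
    if exe:
        out = subprocess.run(["ldd", exe], capture_output=True, text=True).stdout
        for name, path in sorted(mkl_and_mpi_libs(parse_ldd(out)).items()):
            print(name, "=>", path)
    else:
        print("FreeFem++-mpi not found in PATH")
```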

Which example? I can deactivate ScaLAPACK within MUMPS, just as Stefano did, so that the error will go away, I think. Concerning the explicit linking, I remember that one fix I found for this problem was to just put libmpi.a before the MKL BLACS/ScaLAPACK library. But this may be system-specific.

prj- avatar Feb 21 '20 10:02 prj-

@prj- I need to check whether I can share this specific input file (it's not mine), I'll get back to you on that...

I don't think forcibly linking to libmpi.a will help (and if it does, I'm not sure it's a good solution), since the `Cannot load symbol MKLMPI_Get_wrappers` error seems to be caused by linking to a static library while the runtime expects a dynamic library to be found (see https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302).

boegel avatar Feb 21 '20 10:02 boegel

> @prj- I need to check whether I can share this specific input file (it's not mine), I'll get back to you on that...

If it's not possible:

  1. just looking at the `sparams` or the command-line arguments should suffice
  2. you can try to bypass the problem yourself with the additional argument `-mat_mumps_icntl_13 1`

> I don't think forcibly linking to libmpi.a will help (and if it does, I'm not sure it's a good solution), since the `Cannot load symbol MKLMPI_Get_wrappers` error seems to be caused by linking to a static library while the runtime expects a dynamic library to be found (see https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302).

I agree it's not a good solution; it's just the solution I came up with when I last encountered this issue.

prj- avatar Feb 21 '20 10:02 prj-

> > @prj- I need to check whether I can share this specific input file (it's not mine), I'll get back to you on that...
>
> If it's not possible:
>
> 1. just looking at the `sparams` or the command-line arguments should suffice
> 2. you can try to bypass the problem yourself with the additional argument `-mat_mumps_icntl_13 1`

Is that an argument to the `FreeFem++-mpi` command, or should I change the input file somehow?

There is no `sparams` in the input file, and no mention of arg either.

boegel avatar Feb 21 '20 10:02 boegel

You can just supply that on the command line. You can also add -ksp_view to make sure that in the log you have something like:

              ICNTL(13) (efficiency control):                         1

(I'll submit a PETSc MR; this is not the correct text in between the parentheses)

prj- avatar Feb 21 '20 10:02 prj-
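
Editor's note: the check described in the comment above can be automated once the run's output has been captured. This is a minimal sketch, assuming the `-ksp_view` output follows the `ICNTL(n) (description): value` shape of the sample line quoted above; the function names are hypothetical, and the sample text is taken from this thread rather than from a real run.

```python
import re

# Matches lines shaped like "ICNTL(13) (efficiency control):   1"
ICNTL_LINE = re.compile(r"ICNTL\((\d+)\)\s*\(.*?\):\s*(-?\d+)")


def mumps_icntl_values(log_text):
    """Return {index: value} for every ICNTL line found in the captured output."""
    return {int(m.group(1)): int(m.group(2)) for m in ICNTL_LINE.finditer(log_text)}


def has_mkl_fatal_error(log_text):
    """True if the captured output contains an Intel MKL fatal error message."""
    return "Intel MKL FATAL ERROR" in log_text


sample = "              ICNTL(13) (efficiency control):                         1\n"
print(mumps_icntl_values(sample))  # prints {13: 1}
```

If the option was picked up, the returned mapping should contain `13: 1`; an empty mapping suggests the solver in use never printed MUMPS settings through PETSc.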

> You can just supply that on the command line. You can also add -ksp_view to make sure that in the log you have something like:
>
>               ICNTL(13) (efficiency control):                         1
>
> (I'll submit a PETSc MR; this is not the correct text in between the parentheses)

I tried this, but I'm hitting the exact same `Intel MKL FATAL ERROR: Cannot load symbol MKLMPI_Get_wrappers.` issue:

mpirun -np 12 FreeFem++-mpi cube.edp -mat_mumps_icntl_13 1 -ksp_view

Where can I find the log file you're referring to?

The input file I'm using is this one: https://gist.github.com/boegel/ae7b69b2651cade9c58f6c4a6290fe3e

boegel avatar Feb 21 '20 10:02 boegel

Oh, I thought you were using MUMPS through PETSc, not MUMPS directly. You can forget about my options then. Maybe if you load "PETSc" just before MUMPS, it will link to the appropriate wrappers? Alternatively, I can give you the MUMPS + PETSc version of this script, so that you can bypass the error as originally proposed (not to mention that it will be more efficient).

prj- avatar Feb 21 '20 11:02 prj-

Closing since this has gone stale.

jfgrimm avatar Jun 21 '23 11:06 jfgrimm