mantid icon indicating copy to clipboard operation
mantid copied to clipboard

Abins: Use Euphonic Castep parser where available

Open ajjackson opened this issue 2 years ago • 2 comments

Description of work.

Abins reads data from ab-initio vibration/phonon calculations in a variety of formats, including .phonon files from CASTEP.

The inbuilt CASTEP parser makes the unsafe assumption that Gamma-points with LO-TO splitting, which use a longer header line to contain extra information, will always appear first. If this point appears later in the data, earlier k-points are split.

This assumption is quite deeply baked into the logic of the parser, requiring refactoring to address. As we intend in any case to increasingly depend on Euphonic, which contains a .phonon file parser /without/ this defect, this work would be redundant.

Here we

  • detect the unsafe behaviour in the inbuilt parser and raise an error
  • use Euphonic if possible

Not all Mantid builds include Euphonic yet, but when we move to Conda-based packaging for all builds the legacy implementation can be removed.

Abins is structured around the idea that each format has a class implementing the loader, and I have put both implementations into the same class (from which the appropriate method call is dispatched). For now, the use_euphonic flag can be used in testing to force one implementation or the other: when it is time to deprecate, the old implementation can be removed along with this parameter.

To complicate matters, there is also a (different) problem with Euphonic's handling of LO-TO-split Gamma points https://github.com/pace-neutrons/Euphonic/issues/216, but a) it is less severe than the current worst-case for Abins b) hopefully it will be improved soon.

Knowing that this change is coming has deterred me from adding lots of numerical tests of the Euphonic implementation...

To test: Unit tests and systemtests have been updated.

To really check it out you'd need to try running in environments with and without Euphonic installed. Results will slightly change if an LO-TO split Gamma-point is included in the data. (Let me know if you'd like that explained in more detail!)

Fixes #33755


Reviewer

Please comment on the following (full description):

Code Review
  • Is the code of an acceptable quality?
  • Does the code conform to the coding standards?
  • Are the unit tests small and test the class in isolation?
  • If there is GUI work does it follow the GUI standards?
  • If there are changes in the release notes then do they describe the changes appropriately?
  • Are the release notes saved in a separate file, using Issue or PR number for file name and in the correct location?
Functional Tests
  • Do changes function as described? Add comments below that describe the tests performed?
  • Do the changes handle unexpected situations, e.g. bad input?
  • Has the relevant (user and developer) documentation been added/updated?

Does everything look good? Mark the review as Approve. A member of @mantidproject/gatekeepers will take care of it.

ajjackson avatar Apr 12 '22 17:04 ajjackson

do we want to update this now we have moved to conda?

AnthonyLim23 avatar Jun 21 '22 07:06 AnthonyLim23

If we are 100% Conda builds now, then we could remove the legacy parser entirely which would be nice.

As mentioned, work is ongoing on the Euphonic parser. Most likely those changes will be part of Euphonic 1.x along with some API changes. (I pinned the Conda dependency to 0.6.x for a reason...) So it could make sense to merge this before beginning an "update to Euphonic 1.x (w/ API changes)" PR once that is ready.

ajjackson avatar Jun 21 '22 09:06 ajjackson

:wave: Hi, @ajjackson,

Conflicts have been detected against the base branch. Please rebase your branch against the base branch.


This message is automatically generated by prince-chrismc/label-merge-conflicts-action so don't hesitate to report issues/improvements there.

github-actions[bot] avatar Oct 13 '22 10:10 github-actions[bot]

This pull request has been automatically marked as stale because it has not had activity in 3 months. It will be closed in 7 days if no further activity occurs. Allowing pull requests to close as stale helps us filter out old work that is no longer relevant and helps developers focus on reviewing current work. All pull requests closed by this bot act like normal pull requests; they can be searched for, commented on or reopened at any point. If these changes are still relevant then please comment and tag @mantidproject/gatekeepers to highlight that it needs to have a reviewer assigned.

stale[bot] avatar Apr 19 '23 14:04 stale[bot]

Might close later but not stale today

ajjackson avatar Apr 19 '23 15:04 ajjackson

As far as I can tell there are no merge conflicts and it is simply unable to remove the label for permissions reasons. We've seen this before.

ajjackson avatar May 03 '23 16:05 ajjackson

The change to the squareicn_no_sum_LoadCASTEP test data is very subtle; only a trailing newline was changed (and, as a result, the expected checksum is completely different). This is because the handcrafted file is not valid input to the Euphonic reader (as it is slightly off-spec) but was tolerated by the native reader being removed in this PR.

ajjackson avatar May 05 '23 15:05 ajjackson

To complicate matters, the default handling of data with Gamma-point splits in Euphonic has changed since Euphonic version 1.0.0, using a feature added in 0.6.5. Mantid is currently pinned to version 0.6.* so we cannot guarantee that feature is available without updating the requirement, in which case we might as well update it to 1.*.

Because the Conda dependencies sit in another repository it is difficult to update requirements and code simultaneously. I think in this case the solution will be to:

  • update the code to be compatible with 0.6.* and 1.*
  • update the requirement
  • remove the compatibility code for 0.6.*

As we expect the results to change for squareicn_sum_Abins, I think the options there are either a) provide test data for both cases and detect b) provide test data for 1.* and skip the test if using 0.6.* c) drop the test data entirely and rely on tests that Mock the call to Euphonic library

ajjackson avatar May 05 '23 16:05 ajjackson

:wave: Hi, @ajjackson,

Conflicts have been detected against the base branch. Please rebase your branch against the base branch.

github-actions[bot] avatar May 10 '23 09:05 github-actions[bot]

Note that after rebasing onto #35532 (in its current state), the unit tests related to powder calculations are now passing - this branch decouples those tests from CASTEP parsing. The CASTEP loader test still has a couple of failures to address, but this is expected with the implementation change.

ajjackson avatar May 22 '23 20:05 ajjackson

:wave: Hi, @ajjackson,

Conflicts have been detected against the base branch. Please rebase your branch against the base branch.

github-actions[bot] avatar Jul 06 '23 14:07 github-actions[bot]

Running this locally I get a couple of test failures, both of which are explainable:

  • some atomic masses do not match the reference data. This is because Euphonic reads the values from the file while the legacy Mantid parser does some tweaking to match them to nearby values from Mantid. It's not clear that this is necessary and it makes testing more confusing.

  • The legacy parser chooses the wrong set of Gamma-point data from squaricn_sum_LoadCASTEP.phonon in "test_gamma_sum_correction". This is discussed in the (incorrectly closed as stale) parent Issue https://github.com/mantidproject/mantid/issues/33755

    • In this file, the Gamma point appears twice: first with a short line for "exact" data at q=[0, 0, 0], then another set of data is given with a longer specification line indicating a small perturbation away from q=[0, 0, 0].
    • The existing parser skips the first of those points and favours the second. Scientifically this is a bad choice; as discussed in Euphonic Issues such as https://github.com/pace-neutrons/Euphonic/issues/273 we get better statistics with the exact [0, 0, 0] point.
    • (Ideally users would not include the Gamma-point at all when sampling systems with LO-TO splitting, but this is beyond our control.)

ajjackson avatar Jul 31 '23 16:07 ajjackson

Looks like some trouble on MacOS with the Euphonic update

15:16:39 Encountered problems while solving: 15:16:39 - package euphonic-1.2.1-py310h4148280_0 requires llvm-openmp >=16.0.6, but none of the providers can be installed

Anyone know what's going on there?

ajjackson avatar Aug 02 '23 14:08 ajjackson

Ah I see, Mantid is currently pinned to LLVM version 15.

I don't think Euphonic actually needs version 16, it's not specified in the recipe https://github.com/conda-forge/euphonic-feedstock/blob/main/recipe/meta.yaml

but presumably this is what it built against.

Especially weird given that the conda-forge-pinning list is still on 15 https://github.com/conda-forge/conda-forge-pinning-feedstock/blob/main/recipe/conda_build_config.yaml

ajjackson avatar Aug 02 '23 14:08 ajjackson

17:39:56 Encountered problems while solving: 17:39:56 - package euphonic-1.2.1-py310h4148280_0 requires llvm-openmp >=16.0.6, but none of the providers can be installed

grr, Mac version is still using the wrong Conda build. We shouldn't be pinning exact builds, any idea how to kick it to use the version 1 build which requires openmp >= 15?

ajjackson avatar Aug 03 '23 16:08 ajjackson

Linux failure is system tests, that's probably to be expected but I'll need to look more closely and update the data if so.

The Mac situation is weird, installing from mantid-developer-osx.yml looks for the wrong build and fails.

Installing euphonic=1.2.1=py38hd2faf92_1 works.

If I then try to install from the yaml file it tries to install a Python 3.9-based euphonic build and gives up.

ajjackson avatar Aug 03 '23 17:08 ajjackson

Somehow the Mac Conda build is requiring llvm-opemp version >=15 and >=16, even though only 15 is asked for explicitly.

I have been chatting with @jaimergp of conda-forge on their Element channel; he has been very helpful and suggested a couple of things to try.

ajjackson avatar Aug 07 '23 11:08 ajjackson

OSX passing, woohoo :tada:

ajjackson avatar Aug 08 '23 11:08 ajjackson

AbinsBinWidth-updates-zoom

AbinsBinWidth-updates-fullscale

Systemtest changes, zooming in on the region with differences.

This can be rationalised: the inbuilt parser takes the data with energy splitting, the old Euphonic parser takes an average of the split and unsplit versions and the new parser takes the un-split point. This explains why the old Euphonic parser results are an equal mixture of the other two.

ajjackson avatar Aug 08 '23 15:08 ajjackson

Funny build errors, will try rebasing?

ajjackson avatar Aug 15 '23 12:08 ajjackson

Hmm, still getting funny build errors like

15:16:20 [1/909] Generating workbench resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/applications/workbench && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10 -m PyQt5.pyrcc_main -o /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/resources.qrc
15:16:20 /bin/sh: /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10: No such file or directory
15:16:20 [2/909] Generating scripts .pth file
15:16:20 [3/909] Building CXX object Framework/Parallel/CMakeFiles/EventParallelLoader.dir/src/IO/EventLoaderHelpers.cpp.o
15:16:20 [4/909] Generating mantidqt resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/python/mantidqt/mantidqt/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/python/mantidqt && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/cmake -P create_resources.cmake
15:16:20 CMake Error at create_resources.cmake:5 (string):
15:16:20   string sub-command REPLACE requires at least four arguments.
15:16:20 
15:16:20 

The python3.10: No such file or directory is a bit worrying

ajjackson avatar Aug 15 '23 14:08 ajjackson

Hmm, still getting funny build errors like

15:16:20 [1/909] Generating workbench resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/applications/workbench && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10 -m PyQt5.pyrcc_main -o /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/resources.qrc
15:16:20 /bin/sh: /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10: No such file or directory
15:16:20 [2/909] Generating scripts .pth file
15:16:20 [3/909] Building CXX object Framework/Parallel/CMakeFiles/EventParallelLoader.dir/src/IO/EventLoaderHelpers.cpp.o
15:16:20 [4/909] Generating mantidqt resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/python/mantidqt/mantidqt/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/python/mantidqt && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/cmake -P create_resources.cmake
15:16:20 CMake Error at create_resources.cmake:5 (string):
15:16:20   string sub-command REPLACE requires at least four arguments.
15:16:20 
15:16:20 

The python3.10: No such file or directory is a bit worrying

yes, that's what I'm scratching my head about. How does variable Python_EXECUTABLE becomes 3.10 :thinking:

jmborr avatar Aug 15 '23 15:08 jmborr

Hmm, still getting funny build errors like

15:16:20 [1/909] Generating workbench resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/applications/workbench && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10 -m PyQt5.pyrcc_main -o /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/workbench/app/resources.py /jenkins_workdir/workspace/pull_requests-conda-linux/qt/applications/workbench/resources.qrc
15:16:20 /bin/sh: /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/python3.10: No such file or directory
15:16:20 [2/909] Generating scripts .pth file
15:16:20 [3/909] Building CXX object Framework/Parallel/CMakeFiles/EventParallelLoader.dir/src/IO/EventLoaderHelpers.cpp.o
15:16:20 [4/909] Generating mantidqt resources module
15:16:20 FAILED: /jenkins_workdir/workspace/pull_requests-conda-linux/qt/python/mantidqt/mantidqt/resources.py 
15:16:20 cd /jenkins_workdir/workspace/pull_requests-conda-linux/build/qt/python/mantidqt && /jenkins_workdir/workspace/pull_requests-conda-linux/mambaforge/envs/mantid-developer/bin/cmake -P create_resources.cmake
15:16:20 CMake Error at create_resources.cmake:5 (string):
15:16:20   string sub-command REPLACE requires at least four arguments.
15:16:20 
15:16:20 

The python3.10: No such file or directory is a bit worrying

yes, that's what I'm scratching my head about. How does variable Python_EXECUTABLE becomes 3.10 🤔

Not looked into this specific issue, but the Python_EXECUTABLE variable is set here (or in relative places for our other packages), to $PYTHON which I believe is the python interpreter specified in the active conda environment. https://github.com/mantidproject/mantid/blob/39b6213d434256151533c73d10c9c09a8fa39a99/conda/recipes/mantid/build.sh#L24

MialLewis avatar Aug 15 '23 15:08 MialLewis

Looking at this specific issue you're just running on a builder that has been killed by the Python 3.10 PR that has been going through, so it needs a clean rebuild.

MialLewis avatar Aug 15 '23 15:08 MialLewis

@ajjackson I think there may be some error with your release notes:

08:48:34 release/v6.8.0/indirect_geometry:4: WARNING: Bullet list ends without a blank line; unexpected unindent.
08:48:34 release/v6.8.0/indirect_geometry:6: ERROR: Unexpected indentation.

see https://builds.mantidproject.org/job/pull_requests-conda-osx/3316/consoleFull

jmborr avatar Aug 16 '23 15:08 jmborr

The build issue has to do with sphinx warnings. You need to fix your release notes

08:48:34 release/v6.8.0/indirect_geometry:4: WARNING: Bullet list ends without a blank line; unexpected unindent.
08:48:34 release/v6.8.0/indirect_geometry:6: ERROR: Unexpected indentation.

To check that you fixed them ninja docs-html

peterfpeterson avatar Aug 16 '23 15:08 peterfpeterson

Thanks for cleaning this up while I was on leave 😄

ajjackson avatar Aug 21 '23 08:08 ajjackson

no problem! 😄

jmborr avatar Aug 21 '23 10:08 jmborr