EAMxx: update EKAT submodule and adapt EAMxx
Update the EKAT submodule and change EAMxx to conform to the new version of EKAT.
A (very disruptive) PR in EKAT (not yet integrated) will break EKAT into sub-packages, to facilitate its use in other applications without the need to bring in every single ekat dependency.
This PR adapts to those changes, which can be summarized in the following points:
- There is no longer an `ekat` library, but a bunch of `ekat::XYZ` libraries, including an `ekat::AllLibs` one (for convenience).
- No longer use paths in includes: customers should just include `<ekat_blah.hpp>`, without any path (they should not care how ekat files are organized in the ekat repo).
- `ekat_kokkos_utils.hpp` has been split in two: `ekat_reduction_utils.hpp` and `ekat_team_policy_utils.hpp`.
- `ExeSpaceUtils` is no longer present. In its place, use `TeamPolicyFactory` (from `ekat_team_policy_utils.hpp`) for team policy creation, and `ReductionUtils` (from `ekat_reduction_utils.hpp`) for reduction utilities; both are templated on exec space, just like `ExeSpaceUtils` was.
- `ScalarTraits` no longer provides `invalid()` and `quiet_NaN()`, but only provides type info. The functions `quiet_NaN()` and `finite_max()` have been added in `ekat_math_utils.hpp`, but may disappear once we have C++20 (see comment in `ekat_pack.hpp` about "introspective" constexpr).
- `ekat_file_utils.hpp` has been purged, and all tests now simply use the standard library `ifstream`/`ofstream` capabilities.
- There is no longer an "ekat" session. The `ekat::KokkosUtils` library has `initialize_kokkos_session` (and its finalize).
- Minor changes related to how we print the current session configuration (some are in `ekat::Core`, some in `ekat::KokkosUtils`).
- `EkatCreateUnitTestExec` no longer has `EXCLUDE_TEST_SESSION`, but instead has `USER_DEFINED_TEST_SESSION`, for more expressivity.
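For readers unfamiliar with the `ScalarTraits` change, here is a minimal sketch of what free functions like `quiet_NaN()` and `finite_max()` could look like when built on `std::numeric_limits`. This is not the EKAT implementation: the `sketch` namespace, the template signatures, and the absence of Kokkos device annotations are all assumptions made for illustration.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

namespace sketch {  // hypothetical namespace, NOT ekat's

// Quiet NaN for a floating-point type T (assumed signature).
template <typename T>
constexpr T quiet_NaN() {
  static_assert(std::is_floating_point<T>::value,
                "quiet_NaN requires a floating-point type");
  return std::numeric_limits<T>::quiet_NaN();
}

// Largest finite value representable by the arithmetic type T (assumed signature).
template <typename T>
constexpr T finite_max() {
  static_assert(std::is_arithmetic<T>::value,
                "finite_max requires an arithmetic type");
  return std::numeric_limits<T>::max();
}

}  // namespace sketch
```

A caller would write something like `double x = sketch::quiet_NaN<double>();` where it previously relied on the traits class; the real functions in `ekat_math_utils.hpp` may differ in name qualification and signature.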
The biggest challenge was how to break ekat into N packages, since some utilities were "generic enough", but required knowledge about kokkos (e.g., printing the current arch configuration seems generic enough to be in ekat::Core, but the kokkos backend info requires kokkos to be compiled).
The current result seems to be the best compromise between flexibility (for new customers) and robustness (for existing customers).
I have NOT updated mam4xx/haero to use the new ekat, so all their tests may fail to build (or even configure). I am opening the PR anyway, in the hope of getting a build of everything else until mam4xx is updated too.
IMPORTANT NOTE FOR REVIEWERS: I strongly recommend reviewing the commits individually. I tried to group changes by topic, so reviewing one commit at a time may be simpler; a couple of commits are quite large, but seeing the same pattern over and over may make them easier to review.
@tcclevenger @jeff-cohere @jgfouca The deprecation of ekat::any and ekat::enable_shared_from_this is causing a vast amount of warnings. I suggest we switch to their std counterparts asap. I see three options:
1. do it in this PR
2. do it in a follow-up PR
3. update ekat now (with current ekat master), and do the std change, THEN merge this PR.
Maybe option 3 is best, even though it ends up updating ekat twice in a short time?
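As a rough illustration of what the switch to the std counterparts involves, here is a minimal C++17 sketch using `std::any` and `std::enable_shared_from_this`. The `Widget` class and the `int_or` helper are hypothetical names invented for this example; they do not come from EAMxx or EKAT.

```cpp
#include <any>
#include <memory>
#include <string>

// Hypothetical class standing in for a type that would previously have
// derived from ekat::enable_shared_from_this.
class Widget : public std::enable_shared_from_this<Widget> {
public:
  // Returns a shared_ptr that shares ownership with the one managing *this.
  // The object must already be owned by a std::shared_ptr.
  std::shared_ptr<Widget> self() { return shared_from_this(); }
};

// Hypothetical helper: returns the int stored in `a`, or `fallback`
// if `a` is empty or holds a different type.
inline int int_or(const std::any& a, int fallback) {
  const int* p = std::any_cast<int>(&a);  // pointer overload: nullptr on type mismatch
  return p ? *p : fallback;
}
```

In most places the change is likely a mechanical rename of the namespace, provided the code only uses the subset of the interface that the std types also offer.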
1 seems easiest.
I agree with Jim. 1 would really be "asap". :-)
I agree, 1.
PR Preview Action v1.6.2
:rocket: View preview at https://E3SM-Project.github.io/E3SM/pr-preview/pr-7362/
Built to branch gh-pages at 2025-08-01 19:06 UTC.
Preview will be ready when the GitHub Pages deployment is complete.
The physics baselines tests are diffing. Before deciding whether or not to accept the diffs, I want to understand why there are diffs. Quite a lot has changed in this PR, so I need to dig a bit to find out where things drifted (could be cmake flags, implicit promotions, different checks ...).
The cuda tests seem to hang. I have to investigate.
Edit: the test that hangs is always mam4_aero_microphys_standalone. I am trying to debug via cuda-gdb, but all I was able to deduce so far is that it hangs in the 1st run call. I'm trying to bisect the exact location.
Update: the issue was in mam4xx. A fix is in the pipeline.