ufs-weather-model

Test ESMF 8.6.1 beta in UFS weather model

Open junwang-noaa opened this issue 1 year ago • 4 comments

Description

The ESMF 8.6.1 beta has a fix for issue #1121: the 1024-character limit is removed. The ESMF team is asking for confirmation that this fix resolves the issue reported in the UFS weather model.

Solution

  1. Install a test version of ESMF 8.6.1 beta (https://github.com/esmf-org/esmf/releases/tag/v8.6.1b03) on Hera.
  2. Run an atm-only test (e.g. control_p8) for a longer forecast time, with the output times specified in output_fh.

Alternatives

Related to

junwang-noaa avatar Apr 08 '24 14:04 junwang-noaa

@jkbk2004 May I ask if the EPIC team can install the test library on Hera? Thanks

junwang-noaa avatar Apr 08 '24 14:04 junwang-noaa

@RatkoVasic-NOAA can you install https://github.com/esmf-org/esmf/releases/tag/v8.6.1b04 to the spack-stack 1.6.0 location? Hercules or Hera might be a good starting point for this beta test.

jkbk2004 avatar Apr 09 '24 12:04 jkbk2004

@jkbk2004 new version of esmf is still not in spack:

    # generate chksum with 'spack checksum esmf@x.y.z'
    version("8.6.0", sha256="ed057eaddb158a3cce2afc0712b49353b7038b45b29aee86180f381457c0ebe7")
    version("8.5.0", sha256="acd0b2641587007cc3ca318427f47b9cae5bfd2da8d2a16ea778f637107c29c4")
    version("8.4.2", sha256="969304efa518c7859567fa6e65efd960df2b4f6d72dbf2c3f29e39e4ab5ae594")

It cannot be installed as part of spack-stack.

RatkoVasic-NOAA avatar Apr 09 '24 15:04 RatkoVasic-NOAA

@RatkoVasic-NOAA I think you could still install it on the spack side with the esmf@=8.6.1b04 syntax, even if that version is not in package.py.

uturuncoglu avatar Apr 09 '24 22:04 uturuncoglu

@RatkoVasic-NOAA @junwang-noaa @climbfuji If you need any help from our side with installing the beta snapshot, just let us know.

uturuncoglu avatar Apr 17 '24 16:04 uturuncoglu

@uturuncoglu thanks! I installed esmf-8.6.1b04 on Hercules, under spack-stack-1.6.0, without problems. Now we have to install MAPL with the new ESMF version.

RatkoVasic-NOAA avatar Apr 17 '24 16:04 RatkoVasic-NOAA

@RatkoVasic-NOAA That is great. Thanks for the update.

uturuncoglu avatar Apr 17 '24 16:04 uturuncoglu

I'm testing esmf 8.6.1b04 on Hercules. cpld_control_p8 fails with this error:

 57: pe=00057 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
121: pe=00121 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
  8: pe=00008 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>
139: pe=00139 FAIL at line=00956    MAPL_CapGridComp.F90                     <status=41>
 48: pe=00048 FAIL at line=00510    MAPL_CapGridComp.F90                     <status=41>

DusanJovic-NOAA avatar Apr 18 '24 14:04 DusanJovic-NOAA
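With many PEs reporting the same failure, it can help to collapse the log into one line per failing source location. The following is only a sketch: the function name `summarize_failures` is made up for illustration, and it assumes every failure line has the `FAIL at line=NNNNN   <file>` layout seen in the excerpt above.

```shell
# Hypothetical helper: count "FAIL at line=..." messages per file and line.
# Assumes the field after "line=NNNNN" is the source file name.
summarize_failures() {
  awk '/FAIL at line=/ {
         for (i = 1; i <= NF; i++) {
           if ($i ~ /^line=/) {
             # key looks like "MAPL_CapGridComp.F90:00510"
             count[$(i + 1) ":" substr($i, 6)]++
           }
         }
       }
       END { for (k in count) print count[k], k }' "$1" | sort -k2
}
```

Called as `summarize_failures err.log` (the log file name is an assumption), it prints a count followed by `file:line` for each distinct location. For the seven lines quoted above, that would be five failures at line 510 and two at line 956 of MAPL_CapGridComp.F90.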

These are the changes I made to current develop branch:

$ git diff 
diff --git a/modulefiles/ufs_common.lua b/modulefiles/ufs_common.lua
index 1f395d97..f05ff8d4 100644
--- a/modulefiles/ufs_common.lua
+++ b/modulefiles/ufs_common.lua
@@ -10,17 +10,17 @@ local ufs_modules = {
   {["netcdf-c"]        = "4.9.2"},
   {["netcdf-fortran"]  = "4.6.0"},
   {["parallelio"]      = "2.5.10"},
-  {["esmf"]            = "8.5.0"},
+  {["esmf"]            = "8.6.1bs4"},
   {["fms"]             = "2023.02.01"},
   {["bacio"]           = "2.4.1"},
   {["crtm"]            = "2.4.0"},
   {["g2"]              = "3.4.5"},
   {["g2tmpl"]          = "1.10.2"},
   {["ip"]              = "4.3.0"},
-  {["sp"]              = "2.3.3"},
+  {["sp"]              = "2.5.0"},
   {["w3emc"]           = "2.10.0"},
   {["gftl-shared"]     = "1.6.1"},
-  {["mapl"]            = "2.40.3-esmf-8.5.0"},
+  {["mapl"]            = "2.40.3-esmf-8.6.1b04"},
   {["scotch"]          = "7.0.4"},
 }
 
diff --git a/modulefiles/ufs_hercules.intel.lua b/modulefiles/ufs_hercules.intel.lua
index 605fe579..63cfaa98 100644
--- a/modulefiles/ufs_hercules.intel.lua
+++ b/modulefiles/ufs_hercules.intel.lua
@@ -2,7 +2,7 @@ help([[
 loads UFS Model prerequisites for Hercules/Intel
 ]])
 
-prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/envs/unified-env/install/modulefiles/Core")
+prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.6.0/envs/unified-env/install/modulefiles/Core")
 
 stack_intel_ver=os.getenv("stack_intel_ver") or "2021.9.0"
 load(pathJoin("stack-intel", stack_intel_ver))

DusanJovic-NOAA avatar Apr 18 '24 14:04 DusanJovic-NOAA

@DusanJovic-NOAA can you check /work2/noaa/stmp/jongkim/stmp/jongkim/FV3_RT/rt_3264281/cpld_control_p8_intel ? The Intel run is OK. The setup is at /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/modulefiles. Luckily, mapl 2.40.3-esmf-8.5.0 somehow worked, but mapl should be built with the esmf beta snapshot.

jkbk2004 avatar Apr 18 '24 14:04 jkbk2004

@DusanJovic-NOAA there is a typo in your ufs_common: it should be 8.6.1b04, not 8.6.1bs4

RatkoVasic-NOAA avatar Apr 18 '24 14:04 RatkoVasic-NOAA
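A typo like `8.6.1bs4` is easy to miss by eye, so a quick format check on version strings can catch it before a test run. This is a minimal sketch, not part of the UFS tooling: the function name `is_valid_version` and the pattern (dotted numbers with an optional `bNN` beta suffix) are assumptions chosen to fit the plain versions in this thread.

```shell
# Hypothetical check: accept versions such as 8.6.1, 2023.02.01 or 8.6.1b04.
# The pattern is an assumption; composite names such as
# "2.40.3-esmf-8.6.1b04" are deliberately not covered.
is_valid_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)*(b[0-9]+)?$'
}
```

`is_valid_version "8.6.1b04"` succeeds, while `is_valid_version "8.6.1bs4"` returns a non-zero status, so the check can guard a script with `|| exit 1`.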

Here is my ufs_common:

  {["jasper"]          = "2.0.32"},
  {["zlib"]            = "1.2.13"},
  {["libpng"]          = "1.6.37"},
  {["hdf5"]            = "1.14.0"},
  {["netcdf-c"]        = "4.9.2"},
  {["netcdf-fortran"]  = "4.6.0"},
  {["parallelio"]      = "2.5.10"},
  {["esmf"]            = "8.6.1b04"},
  {["fms"]             = "2023.04"},
  {["bacio"]           = "2.4.1"},
  {["crtm"]            = "2.4.0"},
  {["g2"]              = "3.4.5"},
  {["g2tmpl"]          = "1.10.2"},
  {["ip"]              = "4.3.0"},
  {["sp"]              = "2.5.0"},
  {["w3emc"]           = "2.10.0"},
  {["gftl-shared"]     = "1.6.1"},
  {["mapl"]            = "2.40.3-esmf-8.6.1b04"},
  {["scotch"]          = "7.0.4"},

RatkoVasic-NOAA avatar Apr 18 '24 14:04 RatkoVasic-NOAA

Thanks. I fixed the typo and I'm rerunning the test. But correct modules were loaded despite the typo, which is weird.

DusanJovic-NOAA avatar Apr 18 '24 14:04 DusanJovic-NOAA

@DusanJovic-NOAA Note: mapl in this configuration is compiled with [email protected] (although the name suggests otherwise). This was just a test of whether [email protected] is working. [email protected] didn't compile with the new esmf. I'm looking into this with Matt. @jkbk2004 ran tests on Hercules successfully; check with him whether he ran the same tests as you are running.

RatkoVasic-NOAA avatar Apr 18 '24 14:04 RatkoVasic-NOAA

cpld_control_p8 is still crashing.

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

@DusanJovic-NOAA What is the difference between your run and Jong's?

RatkoVasic-NOAA avatar Apr 18 '24 15:04 RatkoVasic-NOAA

I don't know. My working directory is here: /work/noaa/fv3-cam/djovic/ufs/e861/ufs-weather-model.

Maybe it is the fact that @jkbk2004 used mapl 2.40.3-esmf-8.5.0, not mapl 2.40.3-esmf-8.6.1b04?

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

You can try with this ufs_common:

  {["jasper"]          = "2.0.32"},
  {["zlib"]            = "1.2.13"},
  {["libpng"]          = "1.6.37"},
  {["hdf5"]            = "1.14.0"},
  {["netcdf-c"]        = "4.9.2"},
  {["netcdf-fortran"]  = "4.6.1"},
  {["parallelio"]      = "2.5.10"},
  {["esmf"]            = "8.6.1b04"},
  {["fms"]             = "2023.04"},
  {["bacio"]           = "2.4.1"},
  {["crtm"]            = "2.4.0"},
  {["g2"]              = "3.4.5"},
  {["g2tmpl"]          = "1.10.2"},
  {["ip"]              = "4.3.0"},
  {["sp"]              = "2.5.0"},
  {["w3emc"]           = "2.10.0"},
  {["gftl-shared"]     = "1.6.1"},
  {["mapl"]            = "2.40.3-esmf-8.5.0"},
  {["scotch"]          = "7.0.4"},

Also:

hercules: /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/tests> git remote -v
origin  https://github.com/RatkoVasic-NOAA/ufs-weather-model (fetch)
origin  https://github.com/RatkoVasic-NOAA/ufs-weather-model (push)
hercules: /work/noaa/epic/jongkim/UFS-RT/hercules/pr-2093/tests> git branch
* ss-160

RatkoVasic-NOAA avatar Apr 18 '24 15:04 RatkoVasic-NOAA
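When two people post near-identical module lists, a mechanical comparison is more reliable than reading them side by side. The sketch below assumes each list has been saved to a file in the `{["name"] = "version"},` layout shown above; the function names `list_modules` and `compare_module_lists` are illustrative only.

```shell
# Hypothetical helper: print "name version" for each entry of a
# ufs_common-style Lua table such as   {["esmf"] = "8.6.1b04"},
list_modules() {
  sed -n 's/.*\["\([^"]*\)"][[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1 \2/p' "$1"
}

# Hypothetical helper: show entries that differ between two such files.
# Returns the exit status of diff (0 = identical, 1 = differences found).
compare_module_lists() {
  a="$(mktemp)"
  b="$(mktemp)"
  list_modules "$1" | sort > "$a"
  list_modules "$2" | sort > "$b"
  diff "$a" "$b"
  status=$?
  rm -f "$a" "$b"
  return "$status"
}
```

Running `compare_module_lists mine.lua theirs.lua` (both file names are assumptions) prints only the packages whose versions differ, in standard `diff` notation.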

I can, but that will load esmf-8.5.0, not 8.6.1. We should test 8.6.1; we already know 8.5.0 works fine.

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

What's the issue with compiling mapl using esmf 8.6.1?

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

Isn't ESMF 8.6.x backward compatible with the previous release, ESMF 8.5.0?

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

I think the ESMF target definition has changed, which requires updates to the mapl cmake config (lowercase vs uppercase or something like that).

climbfuji avatar Apr 18 '24 15:04 climbfuji

If an update in ESMF requires an update in MAPL, then we should update MAPL as well. We should not be compiling 2.40.3 if it doesn't work with the latest ESMF.

DusanJovic-NOAA avatar Apr 18 '24 15:04 DusanJovic-NOAA

If an update in ESMF requires an update in MAPL, then we should update MAPL as well. We should not be compiling 2.40.3 if it doesn't work with the latest ESMF.

We are waiting for that tag (and they were hoping to wait for an official release of esmf as opposed to a beta snapshot ...)

I'll ping our NASA colleagues and ask them to create the tag. Will let you know when I hear back.

climbfuji avatar Apr 18 '24 15:04 climbfuji

Hmm. The MAPL people are waiting on the official ESMF release. The ESMF people are waiting on us to test a beta snapshot before they make a release, and we are waiting on a MAPL tag, which is waiting on ESMF. Circulus vitiosus.

DusanJovic-NOAA avatar Apr 18 '24 16:04 DusanJovic-NOAA

Hmm. The MAPL people are waiting on the official ESMF release. The ESMF people are waiting on us to test a beta snapshot before they make a release, and we are waiting on a MAPL tag, which is waiting on ESMF. Circulus vitiosus.

GMAO says the tag will be ready next week

climbfuji avatar Apr 18 '24 16:04 climbfuji

I hope I can make a new release of MAPL next week. But because of the requirement for ESMF 8.6.1 (beta or not), I'll need to build new libraries on all our clusters, etc. so that our devs don't have issues building with it (since MAPL will require ESMF 8.6.1).

Now, if you are wanting to test, you could try out this commit https://github.com/GEOS-ESM/MAPL/commit/5f91a5c733eda8cd8d385c108b71b1f41b966c72. This is my current draft PR (see https://github.com/GEOS-ESM/MAPL/pull/2682). This is where I'm tracking the changes to MAPL.


You can also see my spack testing changes here: https://github.com/spack/spack/compare/develop...mathomp4:spack:feature/mathomp4/test-mapl-build

You'll note it says MAPL v5 only because I wanted to make sure I was "safe" when doing the testing. And you'll note the commit for v5 is different since this was a couple weeks ago.

mathomp4 avatar Apr 18 '24 16:04 mathomp4

I ran the control_p8 test with esmf 8.6.1b04 for 240 hours, creating output every hour. The output_fh line for that configuration is longer than 1024 characters, so this confirms that version 8.6.1 fixes #1121.

DusanJovic-NOAA avatar Apr 19 '24 16:04 DusanJovic-NOAA
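To be sure a given run really exercises the former limit, the length of the `output_fh` line can be measured directly. This is a sketch under assumptions: it presumes the setting lives on a single line starting with `output_fh:` in the run directory's configuration file (passed as the first argument), and both function names are hypothetical.

```shell
# Hypothetical helper: print the character count of the first line
# that starts with "output_fh:" in the given file.
output_fh_line_length() {
  awk '/^output_fh:/ { print length($0); exit }' "$1"
}

# Hypothetical helper: succeed only if that line is longer than the
# 1024-character limit discussed in this issue.
exceeds_old_limit() {
  len="$(output_fh_line_length "$1")"
  [ -n "$len" ] && [ "$len" -gt 1024 ]
}
```

For example, `exceeds_old_limit model_configure && echo "line exceeds 1024 characters"` (the file name is an assumption) reports whether the test configuration would have hit the old limit.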

@uturuncoglu @danrosen25 The ESMF 8.6.1b04 testing is done in UFS and the fix works. Thanks.

junwang-noaa avatar Apr 26 '24 14:04 junwang-noaa

Note: We are close to getting ESMF 8.6.1b04 in GEOS land. I encountered a fun bug with GCC 13 that we had to figure out today, but hopefully next week I can release MAPL 2.46 and all is well.

mathomp4 avatar Apr 26 '24 17:04 mathomp4