ufs-weather-model

bug fixes: kchunk3d ignored, hailwat uninitialized in dycore, tile_num wrong for nests

Open SamuelTrahanNOAA opened this issue 11 months ago • 9 comments

Commit Queue Requirements:

  • [x] Fill out all sections of this template.
  • [x] All sub component pull requests have been reviewed by their code managers.
  • [x] Run the full Intel+GNU RT suite (compared to current baselines) on either Hera/Derecho/Hercules
  • [x] Commit 'test_changes.list' from previous step

Description:

Fixes these bugs:

  1. Fix from @DusanJovic-NOAA wherein the kchunk3d setting in model_configure was ignored. This caused an abort due to a negative index in an MPI call on some platforms. This may have been due to a 32-bit integer wraparound, but we cannot confirm that.
  2. A hailwat variable was uninitialized in the FV3 dynamical core. Now it is set to the hailwat tracer index.
  3. The tile_num sent to CCPP in FV3 was wrong for the nest because it was the index of the tile in the mosaic (index 1) instead of the "global tile number" (index 7). This is corrected by having the dynamical core pass the "global tile number" up to the model.
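
For readers unfamiliar with bugs 1 and 3, here is a minimal Python sketch of the two mechanisms. It is an illustration only, not the FV3 code. The wraparound is the suspected cause, which we could not confirm. The helper names `as_int32` and `global_tile_number` are hypothetical, and the six-tile parent assumes a global cubed-sphere grid with a single nest.

```python
INT32_MIN = -2**31
INT32_SPAN = 2**32


def as_int32(value: int) -> int:
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    return (value - INT32_MIN) % INT32_SPAN + INT32_MIN


def global_tile_number(local_tile: int, tiles_in_earlier_grids: int) -> int:
    """Hypothetical helper: offset a tile's index within its own mosaic
    by the number of tiles belonging to the grids that precede it."""
    return tiles_in_earlier_grids + local_tile


# Bug 1 (suspected, unconfirmed): a count or offset that exceeds the
# 32-bit range becomes negative when stored in a 32-bit integer.
too_large = 2**31            # one past the largest signed 32-bit value
wrapped = as_int32(too_large)

# Bug 3: the nest is tile 1 of its own mosaic, but the six tiles of the
# parent cubed sphere come first, so its global tile number is 7.
PARENT_TILES = 6             # assumption: global cubed-sphere parent
nest_global_tile = global_tile_number(1, PARENT_TILES)

print(wrapped, nest_global_tile)
```

Values inside the 32-bit range pass through `as_int32` unchanged, so a problem of this kind would only appear once the sizes involved become large enough.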

No answers should change.

Commit Message:

* UFSWM - 
  * FV3 - correct handling of kchunk3d and use the right tile number in CCPP
    * atmos_cubed_sphere - initialize the hailwat variable and pass global_tile up to model

Priority:

  • Critical

Git Tracking

UFSWM:

Issues:

  • UFSWM: fixes https://github.com/ufs-community/ufs-weather-model/issues/2209
  • UFSWM: fixes https://github.com/ufs-community/ufs-weather-model/issues/2227
    • FV3: https://github.com/NOAA-EMC/fv3atm/issues/797
      • GFDL_atmos_cubed_sphere:
        • https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/issues/328
        • https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/issues/329

Note: Although #2227 is an issue in this repository, the bug is in FV3.

Sub component Pull Requests:

  • FV3: https://github.com/NOAA-EMC/fv3atm/pull/806
    • atmos_cubed_sphere: https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/pull/331

UFSWM Blocking Dependencies:


Changes

Regression Test Changes (Please commit test_changes.list):

  • No Baseline Changes.

Input data Changes:

  • None.

Library Changes/Upgrades:

  • No Updates

Testing Log:

  • RDHPCS
    • [ ] Hera
    • [ ] Orion
    • [ ] Hercules
    • [ ] Jet
    • [ ] Gaea
    • [ ] Derecho
  • WCOSS2
    • [ ] Dogwood/Cactus
    • [ ] Acorn
  • [ ] CI
  • [ ] opnReqTest (complete task if unnecessary)

SamuelTrahanNOAA avatar Mar 21 '24 18:03 SamuelTrahanNOAA

@SamuelTrahanNOAA EPIC wants to go with this PR next. Could you run the full suite on Hera and commit the test_changes.list please?

BrianCurtis-NOAA avatar Apr 11 '24 14:04 BrianCurtis-NOAA

I am rerunning regression tests now. 259 of 299 tests have completed and none have failed. I disabled job resubmission, so this means the tests are passing on the first try.

SamuelTrahanNOAA avatar Apr 11 '24 14:04 SamuelTrahanNOAA

Could someone please request reviews from these individuals?

@DusanJovic-NOAA @zhanglikate @kayeekayee @spanNOAA @ChristianBoyer-NOAA

They have been involved in testing the fix for the critical kchunk3d bug

SamuelTrahanNOAA avatar Apr 11 '24 14:04 SamuelTrahanNOAA

> Could someone please request reviews from these individuals?
>
> @DusanJovic-NOAA @zhanglikate @kayeekayee @spanNOAA @ChristianBoyer-NOAA
>
> They have been involved in testing the fix for the critical kchunk3d bug

Only Dusan seems to be allowed as a requested reviewer, but I believe the others can still give a review.

BrianCurtis-NOAA avatar Apr 11 '24 14:04 BrianCurtis-NOAA

Regression tests passed. No baseline changes.

EDIT: Regression tests passed on Hera. I didn't run them anywhere else.

SamuelTrahanNOAA avatar Apr 11 '24 15:04 SamuelTrahanNOAA

I've merged develop. Those changes were all CICE, so they should not affect this PR's changes nor the bug people are encountering. Hence, I am not rerunning regression tests unless someone asks me to do that. Code managers will run regression tests in the ordinary testing process.

SamuelTrahanNOAA avatar Apr 11 '24 17:04 SamuelTrahanNOAA

> I've merged develop. Those changes were all CICE, so they should not affect this PR's changes nor the bug people are encountering. Hence, I am not rerunning regression tests unless someone asks me to do that. Code managers will run regression tests in the ordinary testing process.

So everyone knows: the full RT suite does not need to be rerun unless code changes are made that relate to the bug or feature being fixed or added. Merging develop is not in that category, so no rerun is needed.

BrianCurtis-NOAA avatar Apr 11 '24 17:04 BrianCurtis-NOAA

I've retested my nested global case inside the global-workflow and it has passed the failure point. (I had already tested it outside the workflow.) This triad of fixes still works for me. I look forward to seeing them in the develop branch.

SamuelTrahanNOAA avatar Apr 11 '24 18:04 SamuelTrahanNOAA

We are going to start working on this PR today. @FernandoAndrade-NOAA @BrianCurtis-NOAA FYI

jkbk2004 avatar Apr 11 '24 19:04 jkbk2004

Jet hasn't finished. Did something go wrong over there?

I've had lots of little technical issues while running on Jet since the Rocky upgrade.

SamuelTrahanNOAA avatar Apr 12 '24 13:04 SamuelTrahanNOAA

> Jet hasn't finished. Did something go wrong over there?
>
> I've had lots of little technical issues while running on Jet since the Rocky upgrade.

It was just a little slow yesterday. It looks like it passed; I'll push it up shortly.

FernandoAndrade-NOAA avatar Apr 12 '24 15:04 FernandoAndrade-NOAA

We can proceed with the merging process. I'll follow up on the cubed-sphere PR.

zach1221 avatar Apr 12 '24 18:04 zach1221

The cubed-sphere PR has been merged. I updated the FV3 PR to point to the authoritative .gitmodules and cubed sphere.

You can proceed to merging the FV3 PR.

SamuelTrahanNOAA avatar Apr 13 '24 00:04 SamuelTrahanNOAA

@SamuelTrahanNOAA FV3 merged. Hash: https://github.com/NOAA-EMC/fv3atm/commit/37e7d4859db4eb75472091abc650831060037715

BrianCurtis-NOAA avatar Apr 14 '24 17:04 BrianCurtis-NOAA

I have reverted .gitmodules and pointed FV3 to the head of the authoritative develop branch.

This PR is ready for final review and merge.

SamuelTrahanNOAA avatar Apr 14 '24 17:04 SamuelTrahanNOAA