ufs-weather-model

aerosol fields do not reproduce when fhmax=4,fhzero=2

Open DeniseWorthen opened this issue 3 years ago • 34 comments

Description

To reduce the time required by the updated cpld_bmark_p8 test with the mesh cap for PR https://github.com/ufs-community/ufs-weather-model/pull/1131, I've tried to reduce fhmax to 4 and restart the model from hour 2.

All files reproduce except for the atmf004.tile[1-6].nc and fv_tracer.res.tile[1-6].nc restart files. These files differ only in the following fields: nh3, nh4a, no3an1, no3an2, no3an3, pm25, pm10.

To Reproduce:

A test branch using the current cpld_control_c96_p8 modified to run for fhmax=4 is here: branch. This test produces the same field differences as those in the updated cpld_bmark_p8 test.

The control and restart cases in the test branch can be run using the oRT command:

./opnReqTest -n cpld_control_c96_p8 -c rst -ek

This will use ecflow and keep the run directory.

DeniseWorthen avatar Apr 25 '22 23:04 DeniseWorthen

@weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

junwang-noaa avatar Apr 27 '22 12:04 junwang-noaa

@DeniseWorthen Does this include the updated compiler flags? I can't find that issue/PR right now but I believe that @rmontuoro had fixed this issue.

bbakernoaa avatar Apr 27 '22 14:04 bbakernoaa

I can get restart reproducibility with the current configuration which uses fhmax in intervals of 6 (depending on the test). It is when reducing the fhmax to 4 (and fhzero to either 1 or 2) that the aerosol fields are not reproducing.

DeniseWorthen avatar Apr 27 '22 14:04 DeniseWorthen

If you use Dusan's PR: https://github.com/ufs-community/ufs-weather-model/pull/1171 does it help? I believe that's the issue/fix Barry is referring to.

JessicaMeixner-NOAA avatar Apr 27 '22 16:04 JessicaMeixner-NOAA

@weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

Sorry I cannot answer the question. But I can ask around for you

weiyuan-jiang avatar Apr 27 '22 16:04 weiyuan-jiang

Thanks @JessicaMeixner-NOAA, I understood which fix Barry was referring to.

I can test Dusan's compile options. However, since aerosols reproduce using fhmax=6,fhzero=6, I would be surprised if that explains why it is not reproducing at fhmax=4, fhzero=2.

DeniseWorthen avatar Apr 27 '22 19:04 DeniseWorthen

I tested using oRT after merging Dusan's release_flags branch and obtained the same non-reproducing aerosol fields.

DeniseWorthen avatar Apr 27 '22 21:04 DeniseWorthen

I've updated the test branch to try a 3/1/4 restart test. The oRT enforces the restart time at FHMAX/2, so the 3/1/4 test cannot be run with the oRT. Also, because of MOM6 Issue #90, the comparison of MOM6 restarts will need to be removed if the 3/1/4 test otherwise reproduces.

DeniseWorthen avatar Apr 29 '22 20:04 DeniseWorthen

Query for someone in GEOS-land, do you have a "descriptive" explanation for the variables in play here? I've never run UFS so I'm a bit in the dark. 😄 I'm sort of guessing they are like our DT (time steps)?

mathomp4 avatar May 02 '22 12:05 mathomp4

Are the restart files the only input files? Are there any Extdata in the tests? @junwang-noaa

weiyuan-jiang avatar May 02 '22 12:05 weiyuan-jiang

@mathomp4 The test case is a C96 global forecast coupled case. The time step for the atmosphere is 720s; it does not change between the control (fh0->4hr from a cold start) and the restart test (fh0->2 cold start, then fh2->4 with restart). In the restart test, the forecast restarts from the current time at fh=2 using the restart files and continues to run 2 hrs to get to fh=4hr.

junwang-noaa avatar May 02 '22 12:05 junwang-noaa

@mathomp4 The fhmax is the forecast length. In this case, we are running the model forward 4 hours and writing a restart for the components at hour 2. Using the restarts at hour 2, the model is run from hour=2 to hour=4. What I am comparing are the FV3 tracer restart files and the model forecast files between the initial run (hr 0:4) and the restart run (2:4).

I can get the aerosol fields to reproduce if I do the same test using a restart at hour 3. In this case I'm still running the model 4 hours but I'm using a restart from hour 3 to restart to run the final 1 hour.

fhzero is the interval when accumulated fields are re-zeroed. I've actually tested w/ both fhzero=1 and 2, so I think it is not really a fhzero issue.
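
A minimal sketch of the comparison described above, assuming each file's fields have already been read into a dictionary mapping field name to a flat list of values. The function name `differing_fields` and the sample values are hypothetical and are not taken from the test output; reading the actual netCDF files is left out.

```python
from typing import Dict, List, Sequence


def differing_fields(control: Dict[str, Sequence[float]],
                     restart: Dict[str, Sequence[float]]) -> List[str]:
    """Return the sorted names of fields that are missing from one run
    or whose values are not exactly identical between the two runs."""
    names = sorted(set(control) | set(restart))
    return [
        name for name in names
        if name not in control
        or name not in restart
        or list(control[name]) != list(restart[name])
    ]


# Made-up values for illustration only: one field matches, one does not.
control_run = {"sphum": [0.1, 0.2], "nh3": [1.0, 2.0]}
restart_run = {"sphum": [0.1, 0.2], "nh3": [1.0, 2.5]}

mismatched = differing_fields(control_run, restart_run)
print(mismatched)
```

Exact equality is used on purpose, since a restart reproducibility test expects bit-for-bit identical results rather than agreement within a tolerance.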

DeniseWorthen avatar May 02 '22 12:05 DeniseWorthen

I think fhzero should not be lower than fhout.

SMoorthi-emc avatar May 02 '22 12:05 SMoorthi-emc

Thanks @SMoorthi-emc. I think I did have fhout set to either 2 (for fhzero=2) or 1 (for fhzero=1) but I will recheck.

@weiyuan-jiang I'm not sure how to answer your question. I have a run directory on hera here

/scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_OPNREQ_TEST/opnReqTest_14673/cpld_control_c96_p8_std_base

DeniseWorthen avatar May 02 '22 12:05 DeniseWorthen

Okay. I can confirm this on the GEOS end it seems. I ran a start-stop run of 4 hours vs 2+2 and I'm getting restart failures as well. I guess my nightly tests never picked up on this because my 'default' regression start-stop test is 24 vs 18+6...and there's a lot of 3s in that.

I've pinged @bena-nasa, @weiyuan-jiang, and @tclune from our group about this.

mathomp4 avatar May 02 '22 14:05 mathomp4

Hi All,
there appears to be a hard coded 3 hourly frequency here https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/NI2G_GridComp/NI2G_GridCompMod.F90#L393 and here: https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/SU2G_GridComp/SU2G_GridCompMod.F90#L480

in gocart2g

If I change this to a 2-hour frequency, then a run of 4 hours vs 2 + 2 passes our start-stop regression test. So this seems suspicious and could explain why something involving 2 hours is misbehaving (just speculation for UFS since I can't test, but it certainly explains why our own regression failed when the run length was not a multiple of 3 hours). It seems the run segment length needs to be a multiple of this interval, or perhaps something needs to be saved in a checkpoint that currently is not, and the logic for this needs to be tightened. I'll open an issue in the gocart repository.
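
A toy model of why the segment length matters, offered as an illustration rather than as the actual GOCART or ESMF logic. It assumes (this is an assumption, not something confirmed in the thread) that the alarm is re-created at the start of every run segment, rings at creation, and then rings every `interval` hours. The helper names `ring_hours` and `segmented` are hypothetical.

```python
def ring_hours(start, stop, interval=3):
    """Hours in [start, stop) at which an alarm created at `start` rings,
    assuming it rings at creation and every `interval` hours afterwards."""
    return list(range(start, stop, interval))


def segmented(breaks, interval=3):
    """Ring hours for a run split at the given hours, with the alarm
    re-created at the start of each segment (the assumed behavior)."""
    hours = []
    for seg_start, seg_stop in zip(breaks, breaks[1:]):
        hours += ring_hours(seg_start, seg_stop, interval)
    return hours


continuous_4 = segmented([0, 4])    # single 4-hour run
split_2_2 = segmented([0, 2, 4])    # restart at hour 2
split_3_1 = segmented([0, 3, 4])    # restart at hour 3

print(continuous_4, split_2_2, split_3_1)
```

Under these assumptions the 3+1 split rings at the same hours as the continuous run, while the 2+2 split does not, which matches the pattern reported in this thread. The same model also agrees for 24 vs 18+6, and for 4 vs 2+2 once the interval is set to 2.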

bena-nasa avatar May 02 '22 14:05 bena-nasa

Here are Arlindo's comments. Quote: " There was a reason why the 3 hour alarm was hardwired, as not to give the user the illusion that they could specify any other value. An easier solution may involve changing the way we handle these oxidants. The way this oxidant is "recycled" always appeared contrived in my opinion. So, stop trying to find a way to address this in code. There is no deep mandate to keep this algorithm. Let us discuss this in our aerosol group meeting."

weiyuan-jiang avatar May 02 '22 18:05 weiyuan-jiang

@weiyuan-jiang when did he make that comment?

tclune avatar May 02 '22 18:05 tclune

https://github.com/GEOS-ESM/GOCART/issues/146#issuecomment-1115201582

bena-nasa avatar May 02 '22 19:05 bena-nasa

@bena-nasa @weiyuan-jiang May I ask if there is any update on this issue? Thanks

junwang-noaa avatar May 09 '22 13:05 junwang-noaa

I am not aware of any update on this issue. @junwang-noaa

weiyuan-jiang avatar May 09 '22 14:05 weiyuan-jiang

@junwang-noaa Our best thought is that the issue is this 3-hourly frequency hard coded in gocart (see the issue linked above in the gocart repo). I think the issue is two-fold: the alarm needs to be created with a fixed reference time, and an extra field needs to be in the checkpoint file. Unfortunately I was having some misbehaviour with the ESMF alarms when I tried to fix this. In that issue Arlindo commented that perhaps the algorithm itself needs to be changed altogether, but I have not heard anything more on that. I was on vacation the last several days. I can take a second look at fixing the current algorithm as is; maybe I did something wrong in my first attempt.
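
To illustrate the first half of that idea, here is a small sketch of an alarm tied to a fixed reference time. It is a toy model on integer hours, not ESMF code: the function name `ring_hours_fixed_ref` and the choice of hour 0 as the reference are assumptions made for the example, and the extra checkpoint field mentioned above is not modeled.

```python
def ring_hours_fixed_ref(start, stop, interval=3, ref=0):
    """Hours h in [start, stop) at which the alarm rings when ring times
    are anchored to a fixed reference hour `ref`, not to the segment start."""
    return [h for h in range(start, stop) if (h - ref) % interval == 0]


def segmented_fixed_ref(breaks, interval=3, ref=0):
    """Ring hours for a run split at the given hours; every segment uses
    the same reference hour, so the split points do not shift the alarm."""
    hours = []
    for seg_start, seg_stop in zip(breaks, breaks[1:]):
        hours += ring_hours_fixed_ref(seg_start, seg_stop, interval, ref)
    return hours


continuous_fixed = segmented_fixed_ref([0, 4])     # single 4-hour run
split_fixed = segmented_fixed_ref([0, 2, 4])       # restart at hour 2

print(continuous_fixed, split_fixed)
```

With a fixed reference the alarm rings at hour 3 in both cases, regardless of where the run is split. Matching ring times alone would not guarantee identical results if state accumulated between rings is not also written to the checkpoint.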

bena-nasa avatar May 10 '22 13:05 bena-nasa

A related issue, #1207, was created to allow the model to restart at fh=3hr and write out restart files at the end of the forecast, fh=4.

junwang-noaa avatar Jun 06 '22 14:06 junwang-noaa

I am curious how PR #1171 is related to this restart reproducibility, as we currently have restart reproducibility when using the 3hr restart interval.

junwang-noaa avatar Oct 11 '22 07:10 junwang-noaa

Since the 3-hourly frequency in GOCART is hardcoded, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

junwang-noaa avatar Jul 03 '23 14:07 junwang-noaa

Since the 3-hourly frequency in GOCART is hardcoded, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

@junwang-noaa I think this was fixed by @bena-nasa in https://github.com/GEOS-ESM/GOCART/pull/224 (or at least partially)? This PR got into GOCART v2.2.0

mathomp4 avatar Jul 03 '23 15:07 mathomp4

@mathomp4 That is great! Currently we have a PR with GOCART pointing to the develop branch as of 5/4 ("Ensure GOCART2G can run without the NI component"). Do we need to make additional changes in the GOCART configuration when switching to GOCART v2.2.0?

junwang-noaa avatar Jul 03 '23 15:07 junwang-noaa

@junwang-noaa What hash are you pointing to? I can look at what's different.

Also, I suppose I'd say use v2.2.1 as that has a bug fix on 2.2.0.

mathomp4 avatar Jul 03 '23 15:07 mathomp4

It is this version.

junwang-noaa avatar Jul 03 '23 15:07 junwang-noaa

Okay. So v2.1.4 essentially. I think you should be able to go to v2.2.1 without any big issues that I can see (famous last words).

mathomp4 avatar Jul 03 '23 15:07 mathomp4