global-workflow
add capability to dynamically generate GSI *info files
Description
Currently config.anal has an if/then/else block that sets the env vars SATINFO, OZINFO, and CONVINFO to the paths of different GSI *info files for different periods over the last several years. For reanalysis, we need a solution that works back to 1979. Rather than add to the if/then/else block in config.anal, I have created a set of scripts that generate *info files dynamically given a date (https://github.com/NOAA-PSL/build_gsinfo). What is needed is a way to use those scripts in global-workflow. Note this is only needed for GSI - for JEDI the needed functionality is included in observation_chronicle (https://github.com/NOAA-EMC/jcb-gdas/tree/develop/observation_chronicle/atmosphere).
Resolves https://github.com/NOAA-EMC/global-workflow/issues/3293
Requires https://github.com/NOAA-EMC/GSI-fix/pull/28
Enabled via USE_BUILD_GSINFO env var that can be set to YES in config.base (default is NO). 3 new scripts added (create_satinfo.sh, create_ozinfo.sh, create_convinfo.sh). These scripts generate the GSI *info for a given analysis date using data from build_gsinfo (which will live inside GSI-fix once https://github.com/NOAA-EMC/GSI-fix/pull/28 is merged).
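A minimal sketch of how the switch might gate the two paths, for satinfo only. Only USE_BUILD_GSINFO and create_satinfo.sh come from this PR; the function name `resolve_satinfo`, the generator's calling convention (date, then output file), and the output file name are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: return the path of the satinfo file to use.
# ASSUMPTION: the generator is called as "<generator> <YYYYMMDDHH> <outfile>";
# the real create_satinfo.sh interface may differ.
resolve_satinfo() {
    local analdate="$1"         # analysis date, YYYYMMDDHH
    local static_satinfo="$2"   # path chosen by the existing if/then/else block
    local workdir="$3"          # directory that receives a generated file
    local generator="${4:-create_satinfo.sh}"

    if [[ "${USE_BUILD_GSINFO:-NO}" == "YES" ]]; then
        local out="${workdir}/satinfo.${analdate}"
        # Send generator chatter to stderr so stdout carries only the path.
        "${generator}" "${analdate}" "${out}" >&2 || return 1
        echo "${out}"
    else
        # Default (NO): keep the statically configured file.
        echo "${static_satinfo}"
    fi
}
```

With the default setting the function simply echoes the static path, so existing behavior is unchanged unless USE_BUILD_GSINFO is set to YES.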
The OBS_INPUT table in the GSI namelist is removed from exglobal_atmos_analysis.sh to allow for separate options for NCEP ops and reanalysis. Both versions are included as text files in build_gsinfo/obs_input. The OBS_INPUT env var can be used to choose which version to use. The NCEP ops version is the default in config.anal.
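As a rough illustration of reading the table from a file instead of hard-coding it, the sketch below appends whichever file OBS_INPUT names to a namelist file. The function name `append_obs_input` and the idea of appending with `cat` are assumptions; the actual splice in exglobal_atmos_analysis.sh may be done differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: append the table file named by OBS_INPUT to a namelist.
# ASSUMPTION: the text file already holds the complete table section, so it
# can be appended verbatim.
append_obs_input() {
    local namelist="$1"          # namelist file under construction
    local table="${OBS_INPUT:-}" # path to the chosen obs_input text file

    if [[ -z "${table}" || ! -r "${table}" ]]; then
        echo "FATAL: OBS_INPUT table '${table}' is unset or unreadable" >&2
        return 1
    fi
    cat "${table}" >> "${namelist}"
}
```

Failing early when the file is missing avoids running the analysis with a namelist that silently lacks its observation table.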
A workaround for https://github.com/NOAA-EMC/GSI/issues/752 is included in exglobal_atmos_analysis.sh (pointing to a separate directory for HIRS coefficient files). This hack can be removed once https://github.com/NOAA-EMC/GSI/issues/783 is merged.
Type of change
- [ ] Bug fix (fixes something broken)
- [x] New feature (adds functionality)
- [ ] Maintenance (code refactor, clean-up, new CI test, etc.)
Change characteristics
- Is this a breaking change (a change in existing functionality)? NO
- Does this change require a documentation update? YES
- Does this change require an update to any of the following submodules? YES
- [x] GSI-fix https://github.com/NOAA-EMC/GSI-fix/pull/28
How has this been tested?
- Clone and build on gaeac6 and hera
- Cycled test on gaeac6 and hera
Checklist
- [ ] Any dependent changes have been merged and published
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have documented my code, including function, input, and output descriptions
- [x] My changes generate no new warnings
- [x] New and existing tests pass with my changes
- [ ] This change is covered by an existing CI test or a new one has been added
- [ ] Any new scripts have been added to the .github/CODEOWNERS file with owners
- [ ] I have made corresponding changes to the system documentation if necessary
@jack-woollen @jswhit2 Jeff I didn't figure out how to update your branch with the convinfo file containing complete satwnd definitions. Instead you can copy from /work2/noaa/da/jwoollen/RAEXPS/scripts/2021ozn1/build_gsinfo/convinfo/merged_convinfo.txt.
@ClaraDraper-NOAA I think you forgot to include the link. Would you be able to tell us specifically which lines to add to the reanalysis version of convinfo (https://github.com/NOAA-PSL/build_gsinfo/blob/main/convinfo/merged_convinfo.txt)? The anavinfo changes won't be needed for the scout runs until we start running ensemble DA.
@jswhit2 To include the soil analysis in the reanalysis / scout runs, we'll need to make changes to the convinfo and anavinfo. For the NRT system, those changes are here.
Can you please add these changes into your [conv/anav]info files? I'm not sure if it's better to include them always, or as an option.
I did forget the link! It's here
If we want to assimilate 2m obs into the scout run (maybe not a great idea???), you can see the changes you need by running this in fix/gsi/:
diff global_convinfo_2mObs.txt global_convinfo.txt
If we just want to monitor them, you can leave convinfo as is. For monitoring or assim, we need to add t2m and q2m to the met_guess and state_derivative namelists in anavinfo. There's an example here on hera:
/scratch2/BMC/gsienkf/Clara.Draper/gerrit-hera/global-workflow_CNTRL/fix/global_anavinfo_2mDiag.l127.txt
OK got it - thanks @ClaraDraper-NOAA
Moving to draft while upstream issues are addressed. @jswhit feel free to re-mark this as ready for review when it is so.
A few minor nit-picks. There are also a number of shell-check issues noted here: https://github.com/NOAA-EMC/global-workflow/actions/runs/14601534590/job/40960623756?pr=3472. Could you please also address these?
@DavidHuber-NOAA those shell-check errors do not appear to be coming from lines I modified so I hesitate to change them.
@jswhit2 I have addressed the shellcheck issues in exglobal_atmos_analysis.sh and exglobal_diag.sh.
To reproduce the gfsv17_historical GSI *info files, the build_gsinfo dir in GSI-fix will need to be updated to match https://github.com/NOAA-PSL/build_gsinfo-fix/pull/3. It will not reproduce them bit for bit, since I've turned off some instruments that no longer exist, but are still turned on in satinfo (like amsua_metop-a and amsua_aqua).
I think all the requested changes have been made and this PR is ready to go once @RussTreadon-NOAA and @ClaraDraper-NOAA finish their reviews.
@DavidHuber-NOAA curious about why you made this change. Seems like the correct place to look is ${FIXgsi}/build_gsinfo.
@jswhit that change goes along with this one: https://github.com/jswhit2/global-workflow/pull/2/commits/8167654653d5f97a8bb625840208e87badae4f66. The latter creates links from the build_gsinfo contents (satinfo, ozinfo, etc) to parm/gsinfo (I made some additional commits after this to correct some bugs in the linking; you can see the full, correct list here).
The reason for this change set is that we can link directly to the gsinfo files in the build_gsinfo submodule and thus not have to stage new GSI fix file sets on all of the platforms. Since the build_gsinfo submodule is version controlled, I wanted to avoid staging it.
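For readers unfamiliar with the linking approach, here is a minimal sketch of the idea. The function name `link_gsinfo` and the flat "link every top-level entry" layout are assumptions; the actual list of links in the commits above may be more selective.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: symlink each top-level entry of a build_gsinfo
# checkout (satinfo, ozinfo, ...) into a parm/gsinfo directory.
link_gsinfo() {
    local src dest entry
    # Resolve to an absolute path so the links stay valid from any cwd.
    src="$(cd "$1" && pwd)" || return 1
    dest="$2"
    mkdir -p "${dest}" || return 1

    for entry in "${src}"/*; do
        # Skip the unexpanded pattern when the source directory is empty.
        [[ -e "${entry}" ]] || continue
        # -n replaces an existing link instead of descending into it.
        ln -sfn "${entry}" "${dest}/$(basename "${entry}")" || return 1
    done
}
```

Because the links point into the version-controlled submodule, updating the submodule hash is enough to pick up new *info data; nothing has to be restaged on each platform.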
@jswhit2 @ClaraDraper-NOAA I ran a test case that generates the 2m observation info files along with the other GSI *info files for a C96 analysis and for cycle 2025090100. The run directory for the analysis can be found here: /scratch4/NCEPDEV/stmp/David.Huber/RUNDIRS/gsinfo/gdas.2025090100/anal.2297506
Can you take a look at this directory and verify that everything looks as it should (and/or ping others to take a look)? I would like to start the process of merging upstream submodule PRs so we can move this PR forward.
The expdir for the experiment can be found here: /scratch3/NCEPDEV/stmp/David.Huber/para_gsinfo/expdir/gsinfo
It is worth noting that the gdas_analdiag job failed for this case. I have not investigated the cause of the failure yet, but it suggests that the diag jobs will need to be reworked some to support dynamic GSI *info files. I'm not sure if this needs to happen for this PR or if this can happen in a follow-up. What do you think?
I don't think we can let this get merged in if it breaks analdiag
@CoryMartin-NOAA just clarifying that this only breaks ‘analdiag’ if the ‘build_gsinfo’ is enabled, which it is not by default. I’m fine with continuing to work on this issue if that’s still a blocking issue.
@DavidHuber-NOAA got it, I misunderstood. As long as it passes with the default/old configuration, that's fine with me
OK that makes sense. Thanks @DavidHuber-NOAA
Unfortunately, I don't have RDHPCS access yet (after my recent re-hire) so I can't see those files. I do still have access to orion, so if you can copy the files there (including the log from the failed analdiag step) I can take a look.
@jswhit2 sure, I've copied over the expdir, comroot, and anal.2297506 directories to Orion here: /work2/noaa/global/dhuber/gsinfo.
the analdiag step fails because a diag_tcp file is not created. This should be fixed by https://github.com/jswhit2/global-workflow/commit/c3518bb392a4ca73c24c156fa54d821dc35decfd
Checked the satinfo, ozinfo and convinfo files in @DavidHuber-NOAA's test and compared them to the files in gfsv17_historical. Aside from one error in avhrr3_metop-c (which is now fixed in build_gsinfo-fix), the files are functionally equivalent (but not identical). The only differences are for instruments that no longer exist, or are turned off.
@jswhit thanks for the fixes! That did get the gdas_analdiag job to run and the rest of the cycle ran successfully as well.
@jswhit2 @DavidHuber-NOAA - Is the GSI hash in this branch correct? I'm not able to clone/build:
fatal: Fetched in submodule path 'gsi_enkf.fd', but it did not contain 1460f419c700dadb706c3c85f2cbe71cc280aab6. Direct fetching of that commit failed.
Github is also a bit confused about the hash.
@CatherineThomas-NOAA the branch lives in my fork. I will update the .gitmodules file to tell Git/GitHub where to find it.
@CatherineThomas-NOAA It should be clonable now. I had a couple of local commits on Ursa that weren't pushed. I updated the .gitmodules file as well for good measure.
@DavidHuber-NOAA - I was able to clone and run successfully on GaeaC6 after the change. Thanks!
@DavidHuber-NOAA @jswhit GSI-fix PR https://github.com/NOAA-EMC/GSI-fix/pull/50 has been successfully merged to its repo. Do you need someone else to update the hash in the GSI as well? Are there other changes in the GSI hash associated with this PR or is it just the GSI-fix hash update?
I think it's just the GSI-fix hash in GSI that needs updating. @DavidHuber-NOAA could you update the hash in your fork and issue a PR? Once that is merged I can update the GSI hash here and this PR should be ready to merge.
The associated GSI PR https://github.com/NOAA-EMC/GSI/pull/945 has been merged. Looks like from here we need to:
- Update the GSI hash
- Update the glopara staged GSI-fix directory (with new date)
- Update gsi fix version date
- Turn on build_gsinfo by default in gfs/config.base
Did I miss anything?