global-workflow
Stop using stand-alone UPP
What new functionality do you need?
As part of the Rocky 8 upgrade for Hera (PR #2421), we had to move to a stand-alone UPP version because the one in UFS has not yet been updated. Once the UPP version in UFS is updated to include the Rocky 8 updates, we should move back to using that version instead of checking out a separate version.
What are the requirements for the new functionality?
No separate UPP submodule
Acceptance Criteria
Dependency: ufs-community/ufs-weather-model#2213
- [ ] Update UFS to hash containing UPP updates for Rocky 8
- [ ] Remove UPP submodule
- [ ] Update build_upp.sh to use ufs_model.fd/FV3/upp/tests instead of upp.fd/tests
- [ ] Restore linking of ufs_model.fd/FV3/upp to sorc/upp.fd in link_workflow.sh
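A rough sketch of what the restored link could look like. The function name link_upp_from_ufs and its argument are hypothetical and are not taken from link_workflow.sh; only the two paths come from the criteria above. It assumes the separate UPP submodule has already been removed, and it refuses to touch sorc/upp.fd if a real directory is still there.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the actual contents of link_workflow.sh.
# $1 is assumed to be the sorc directory of a global-workflow clone.
link_upp_from_ufs() {
  local sorc_dir="$1"
  local src="${sorc_dir}/ufs_model.fd/FV3/upp"
  local dest="${sorc_dir}/upp.fd"

  # The UPP copy inside the UFS checkout must exist before linking to it.
  if [ ! -d "${src}" ]; then
    echo "missing ${src}" >&2
    return 1
  fi

  # Do not clobber a real directory (for example a leftover submodule checkout).
  if [ -e "${dest}" ] && [ ! -L "${dest}" ]; then
    echo "${dest} exists and is not a symlink" >&2
    return 1
  fi

  # -n replaces an existing upp.fd symlink instead of linking inside its target.
  ln -sfn "${src}" "${dest}"
}
```

With a link like this in place, upp.fd/tests resolves to ufs_model.fd/FV3/upp/tests, so pointing build_upp.sh at the UFS path directly and keeping the link are consistent with each other.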
Suggest a solution (optional)
No response
FYI @WenMeng-NOAA
@JessicaMeixner-NOAA @WalterKolczynski-NOAA I have been preparing my UFS PR to update the UPP submodule.
@WalterKolczynski-NOAA and @WenMeng-NOAA: I assume from this issue that $HOMEgfs/sorc/upp.fd is the stand-alone UPP. Executing sorc/build_all.sh in a working copy of develop at d6be3b5c on Hera reports a UPP build failure:
Running "module reset". Resetting modules to system default. The following $MODULEPATH directories have been removed: None
Building gsi_enkf, ufs, gfs_utils, gdas, ww3prepost, ufs_utils, gsi_utils, gsi_monitor, upp
Starting build_gsi_enkf.sh
Starting build_ufs.sh
Starting build_gfs_utils.sh
Starting build_gdas.sh
Starting build_ww3prepost.sh
Starting build_ufs_utils.sh
Starting build_gsi_utils.sh
Starting build_gsi_monitor.sh
Starting build_upp.sh
build_gsi_enkf.sh completed successfully!
build_gfs_utils.sh completed successfully!
build_ufs_utils.sh completed successfully!
build_gsi_utils.sh completed successfully!
build_gsi_monitor.sh completed successfully!
build_ww3prepost.sh completed successfully!
build_upp.sh failed with status 2!
build_ufs.sh completed successfully!
build_gdas.sh completed successfully!
BUILD ERROR: One or more components failed to build
Check the associated build log(s) for details.
A check of sorc/logs/build_upp.log shows
[ 88%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/OTLIFT.f.o
[ 89%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/SURFCE.f.o
[ 90%] Linking Fortran static library libupp.a
/usr/bin/ar: Relink `/apps/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `sinf'
Error running link command: Segmentation fault
make[2]: *** [sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:2182: sorc/ncep_post.fd/libupp.a] Error 1
make[2]: *** Deleting file 'sorc/ncep_post.fd/libupp.a'
make[1]: *** [CMakeFiles/Makefile2:133: sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
CI testing using C96C48_hybatmDA and C96C48_ufs_hybatmDA encounters failed jobs for gdasatmanlupp and gfsatmanlupp because $HOMEgfs/exec/upp.x does not exist. This is a soft link pointing at $HOMEgfs/sorc/upp.fd/exec/upp.x.
Is this failure expected?
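For anyone hitting the same symptom, a minimal sketch of how to tell a dangling upp.x link from a missing one. The function name check_upp_exec is made up for illustration; it only uses standard shell file tests and assumes the path passed in is $HOMEgfs/exec/upp.x.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the state of an executable path such as
# $HOMEgfs/exec/upp.x.
check_upp_exec() {
  local exe="$1"
  if [ -L "${exe}" ] && [ ! -e "${exe}" ]; then
    # The symlink exists but its target (the built upp.x) does not.
    echo "dangling"
  elif [ -x "${exe}" ]; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Usage (assumes HOMEgfs is set):
#   check_upp_exec "${HOMEgfs}/exec/upp.x"
```

A "dangling" result would match the case described here, where the link is in place but the UPP build never produced sorc/upp.fd/exec/upp.x.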
The failure is not expected from a fresh clone. If you tried to pull develop into an existing clone, you should have gotten a warning that git couldn't overwrite the upp.fd symlink. If that is the case, delete the symlink and then pull again (recursively, or run submodule update afterwards).
Manually remove sorc/upp.fd followed by git submodule sync and git submodule update. Then manually execute ./build_upp.sh in $HOMEgfs/sorc. This worked. upp.x created. Rerun of gdasatmanlupp and gfsatmanlupp was successful.
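The recovery steps above could be scripted roughly as follows. This is a sketch, not a supported tool: the function names are hypothetical, the argument is assumed to be the top of the global-workflow clone ($HOMEgfs), and the --init --recursive flags on git submodule update are an addition that the thread does not spell out.

```shell
#!/usr/bin/env bash
# Remove sorc/upp.fd only when it is a leftover symlink, so that a real
# submodule checkout is never deleted by accident.
remove_stale_upp_link() {
  local target="$1"
  if [ -L "${target}" ]; then
    rm -- "${target}"
    echo "removed stale symlink ${target}"
  else
    echo "${target} is not a symlink; nothing removed"
  fi
}

# Hypothetical wrapper for the full sequence: drop the old link,
# resync and update submodules, then rebuild UPP.
refresh_upp() {
  local homegfs="$1"
  remove_stale_upp_link "${homegfs}/sorc/upp.fd"
  (cd "${homegfs}" && git submodule sync && git submodule update --init --recursive) || return 1
  (cd "${homegfs}/sorc" && ./build_upp.sh)
}

# Usage (assumes HOMEgfs is set):
#   refresh_upp "${HOMEgfs}"
```

Checking for a symlink before deleting is a deliberate choice here: on a fresh clone sorc/upp.fd is the submodule itself, and removing it would only force an unnecessary re-checkout.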
@WalterKolczynski-NOAA @aerorahul The ufs-weather-model PR #2213 was submitted for updating upp submodule.
@WenMeng-NOAA thanks for keeping us updated. The first time we update UFS after that is merged, we can remove the temporary submodule.
Manually remove sorc/upp.fd followed by git submodule sync and git submodule update. Then manually execute ./build_upp.sh in $HOMEgfs/sorc. This worked. upp.x created. Rerun of gdasatmanlupp and gfsatmanlupp was successful.
I got the same error on Hera (Rocky8) and the manual method did not work for me.
[ 89%] Building Fortran object sorc/ncep_post.fd/CMakeFiles/upp.dir/SURFCE.f.o
[ 90%] Linking Fortran static library libupp.a
/usr/bin/ar: Relink `/apps/oneapi/compiler/2022.0.2/linux/compiler/lib/intel64_lin/libimf.so' with `/lib64/libm.so.6' for IFUNC symbol `sinf'
Error running link command: Segmentation fault
make[2]: *** [sorc/ncep_post.fd/CMakeFiles/upp.dir/build.make:2182: sorc/ncep_post.fd/libupp.a] Error 1
make[2]: *** Deleting file 'sorc/ncep_post.fd/libupp.a'
make[1]: *** [CMakeFiles/Makefile2:133: sorc/ncep_post.fd/CMakeFiles/upp.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
I started from a clean recursive clone. I tried twice but got the same error.
Here are the steps I used to reproduce the error:
git clone --recursive https://github.com/NOAA-EMC/global-workflow
cd global-workflow/sorc
./build_upp.sh
Could this be related to any of my environment settings?
@guoqing-noaa we found there is actually an issue with the UPP hash. We added the fix into #2442, which should be merged soon.
@WalterKolczynski-NOAA My UFS PR #2213 was merged today. You may update the global-workflow accordingly to solve this issue.