Negative runoff quick-fix
A quick-fix to eliminate the negative runoff sent from ROF to OCN. Activated by setting redirect_negative_qgwl = .true. in user_nl_mosart. Two scenarios considered:
Scenario A (net_global_qgwl ≥ 0):
- Proportionally scales down positive qgwl cells
- Zeros out negative qgwl cells
- No outlet redistribution
Scenario B (net_global_qgwl < 0):
- Zeros out all qgwl
- Redistributes deficit to all outlets proportionally
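For readers following along, the two scenarios can be sketched in a few lines of Python. This is only an illustration, not the MOSART Fortran code in this PR. The function name `redirect_negative_qgwl_sketch`, the scaling factor `net / sum(positive qgwl)` in Scenario A, and the choice to weight outlets by their discharge in Scenario B are all assumptions made for the example.

```python
def redirect_negative_qgwl_sketch(qgwl, outlet_discharge):
    """Illustrative sketch only (hypothetical helper, not the PR's code).

    qgwl             -- per-cell qgwl values (may be negative)
    outlet_discharge -- per-outlet discharge values
    Returns (new_qgwl, new_outlet_discharge).
    """
    net_global_qgwl = sum(qgwl)

    if net_global_qgwl >= 0.0:
        # Scenario A: zero the negative cells and scale the positive ones
        # down so the global total is unchanged (assumed scaling rule).
        total_positive = sum(q for q in qgwl if q > 0.0)
        scale = net_global_qgwl / total_positive if total_positive > 0.0 else 0.0
        new_qgwl = [q * scale if q > 0.0 else 0.0 for q in qgwl]
        # No outlet redistribution in this scenario.
        return new_qgwl, list(outlet_discharge)

    # Scenario B: zero all qgwl and spread the (negative) net over outlets.
    new_qgwl = [0.0] * len(qgwl)
    total_outlet = sum(outlet_discharge)
    if total_outlet <= 0.0:
        # Nothing to weight by; this sketch leaves outlets untouched,
        # which means the deficit is not conserved in this corner case.
        return new_qgwl, list(outlet_discharge)
    # Assumed weighting: each outlet absorbs a share proportional to its
    # discharge. A deficit larger than total_outlet would still go negative.
    new_outlets = [d + net_global_qgwl * d / total_outlet for d in outlet_discharge]
    return new_qgwl, new_outlets


if __name__ == "__main__":
    print(redirect_negative_qgwl_sketch([4.0, -1.0, 2.0, -1.0], [6.0, 2.0]))
    print(redirect_negative_qgwl_sketch([1.0, -3.0], [6.0, 2.0]))
```

In both branches of the sketch the global water total (qgwl plus outlet discharge) is unchanged, except for the zero-discharge corner case flagged in the comments.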
@hydrotian Please can you provide a location of the coupled simulation with these changes for us to explore? Also can you provide diagnostics for this simulation? Finally, can you confirm that these changes pass SMS, PET, PEM and ERS tests in a B-case?
@proteanplanet I don't have a coupled simulation done with this PR yet, but I plan to submit one following my previous Bluetip simulation. This PR passed the e3sm_land_developer test suite, which includes 50+ tests, on Compy with some Namelist changes and Throughput changes. See the attached test results.
test_results.txt
Those test results don't have any PET or PEM tests. Try PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP and PEM.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP
@rljacob The PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP simulation failed on Compy with the following error message:
Opened existing file
/compyfs/inputdata/share/domains/domain.lnd.ne4pg2_oQU240.190321.nc 23
lat/lon grid flag (isgrid2d) is F
ncd_inqvid: variable LANDMASK is not on dataset
decompInit_lnd(): Number of clumps exceeds number of land grid cells
320 211
ENDRUN:
ERROR in decompInitMod.F90 at line 183
It is strange as I did not modify the land model in this PR. Any ideas? Should I try it on Chrysalis instead?
Yes try chrysalis. There may not be a good pelayout for that case on compy.
I ran a PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test and it failed the comparison between the two runs. The PEM_Ln9.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test that's in e3sm_integration passes, but since it's only running 9 steps mosart only runs once in that test
My PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP passed, but the PEM.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP failed on comparison as well, because the 2nd run couldn't complete. I increased the walltime to 2 hours (maximum for a debug queue on Chrysalis?) but the simulation appeared to stall at some point. Then I tested the baseline (https://github.com/E3SM-Project/E3SM/commit/64046ec75587d9fcd035f22553192665dd540f56) and it failed at the same point.
Thanks @hydrotian -- I checked and both runs for my PEM test completed fine, just had different results. I'm running a similar PET test right now
@jonbob Thanks. Could you share the cprnc.out report? I want to see which fields are different between the two runs.
Sure, but after five days it ends up with 351 out of 507 fields different. It's at:
/lcrc/group/acme/ac.jwolfe/scratch/chrys/PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.20251022_120245_ruutak/PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.20251022_120245_ruutak.cpl.hi.0001-01-06-00000.nc.base.cprnc.out
OK, the similar PET test (PET.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel) passed
Thanks, @jonbob. Any insights into the PEM test failure? Would you mind running the same PEM test on the baseline master that I branched from (https://github.com/E3SM-Project/E3SM/commit/64046ec75587d9fcd035f22553192665dd540f56)?
No insights from the PEM test -- we would have to do one where we tried to catch the first field that gets different answers. @proteanplanet noticed that you have a routine for sort_outlets_by_discharge_desc but we couldn't see it getting called?
Yes. That was from an earlier commit on this branch. I can clean it up.
To get a better idea of when it diffs, change the river coupling frequency to match the other models. That might allow you to go back to a 9 nstep test. Also change the coupler history output to be every timestep.
@hydrotian -- I set redirect_negative_qgwl = .false. in your branch and the PEM test passes
The PEM test has passed now. Both @jonbob and I confirmed that on our separate tests.
Status: waiting for climate tests to see impact.
A 10-year fully-coupled simulation with this quick-fix based on v3.LR.piControl has completed. It is confirmed that there's zero negative runoff passed to the ocean from the land.
The monthly river discharge comparison at the river outlet (To Ocean) and the last gridcell before the outlet (Over Land) for the 'quick-fix' run shows that the amount of water redirected by this quick-fix is negligible relative to the river discharge.
The river discharge comparison between this simulation and the baseline at major river outlets shows that the negative runoff redirection did have some impact on the regional hydrology.
I also tested this PR with the new flag on and it passed:
- ERP_Ld3.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.allactive-pioroot1