E3SM
E3SM copied to clipboard
Fix C balance and FPE errors
Carbon balance was never checked correctly and let through negative values.
Initialization of Ecosystem variables needed to be re-adjusted, and balance checking for FATES needed to be changed, which causes round-off differences in CMASS_BALANCE_ERROR
for fate_cold_allvars
test.
Also included divide-by-zero check in PhenologyMod.F90 to fix fpe in debug mode.
[non-BFB] for one FATES test. Fixes #6120 Fixes #6203 Fixes #6177
@glemieux Here is a PR to fix the balance checking issues. I did get DIFF for fates_cold_allvars
but only for CMASS_BALANCE_CHECK
but the values were still ~1E-16
.
Let me know if how I implemented the new balance checking makes sense to you.
After doing a test merge I obtained FILLDIFF for the fates_cold_allvars
. That I need to look into today
Quick update: fates_cold_allvars
FAILs at the COMP_base_rest
stage. The cprnc file just shows CMASS_BALANCE_ERROR
with very small differences. Since there are no other diffs (i.e., in variables used to calculate carbon balance), I wonder if it's just due to floating point rounding issues. But, if that is the case, I'm not sure what the solution should be.
CMASS_BALANCE_ERROR (lndgrid,time) t_index = 10 10
127 384 ( 25, 1) ( 93, 1) ( 148, 1) ( 148, 1)
211 0.000000000000000E+00 -2.810582373931857E-16 2.3E-21 -1.292301226966941E-16 2.0E-06 -1.292301226966941E-16
211 0.000000000000000E+00 -2.810570991926628E-16 -1.292278595305381E-16 -1.292278595305381E-16
384 ( 25, 1) ( 93, 1)
avg abs field values: 1.589865621873051E-16 rms diff: 0.0E+00 avg rel diff(npos): 2.0E-06
1.589865886570847E-16 avg decimal digits(ndif): 5.7 worst: 4.8
RMS CMASS_BALANCE_ERROR 0.0000E+00 NORMALIZED 0.0000E+00
What is the diff with? A baseline from the previous code or is it a diff within a test like an ERS test?
Not baseline diff (though that is also there but makes sense to me). It fails the COMP_base_rest portion of ERP test (same thing with the ERS version I made for debugging).
Try a REP test.
REP passes just fine. I'm going to double check that each variable for carbon balance is read/written to the restart file, and maybe a longer simulation will reveal diff's accruing in other fields.
status:
fates_cold_allvars
still fails restart test. Tried adding every gridcell and column variable related to CMASS_BALANCE to history and restart files but no other differences. Tried running test for 2 years to accumulate more carbon, but there was no other diffs detected.
Adjusting the history output interval did allow the test to pass the restart comparison but that's not a real solution.
Scheduled meeting with FATES devs to go over the logic for balance checking with FATES enabled.
@peterdschwartz what is the status of this PR?
@rljacob I met with @glemieux a couple of weeks ago. He mentioned a possible time-step dependent part of FATES that could be the cause, but I think he's been busy with the FATES api update and our follow up meeting with @thorntonpe was cancelled due to lab holiday.
Today, I'll try increasing the history output to be every timestep and see if that provides any useful information.
Update:
Increasing output freq finally showed a new diff in a field. Looking into it further, it appears that the vertical transport tendency terms for carbon are not added to the corresponding leaching term due to some issue 5 years ago (according to git blame atleast), but it is for nitrogen and phosphorus.
Hopefully the straightforward solution works here.
LITR2C_TNDNCY_VERT_TRANS (lndgrid,levdcmp,time) t_index = 194 194
3 5760 ( 153, 2, 1) ( 153, 1, 1) ( 237, 3, 1) ( 247, 10, 1)
3165 1.043116371057806E-09 -1.643725044964128E-09 2.2E-19 2.944499461962624E-12 8.1E-11 1.293930177203629E-16
3165 1.043116371057806E-09 -1.643725044964128E-09 2.944499678803059E-12 1.293930309552527E-16
5760 ( 153, 2, 1) ( 153, 1, 1)
avg abs field values: 4.034842251976478E-11 rms diff: 0.0E+00 avg rel diff(npos): 8.1E-11
4.034842251976478E-11 avg decimal digits(ndif): 7.1 worst: 7.0
RMS LITR2C_TNDNCY_VERT_TRANS 0.0000E+00 NORMALIZED 0.0000E+00
@peterdschwartz can you point me to the line of code with the git blame that you noted earlier, please?
@glemieux This Code block is what I was referring to. The transport tendency is not added to carbon leached pool even though the corresponding terms are added for Nitrogen and Phosphorus.
More experimenting showed that bypassing the SoilLittVertTransp calculation, the restart failure goes away. For FATES, the UpdateLitterFluxes
is done just prior that adjusts the inputs to the SoilLittVertTransp
that could be the cause. Line
Waiting to see about FATES side of this PR.
Since there's a previous fates related issue linked above with the same errors, I'm going to just clean the branch up for the other issues and follow the review comments. The other issue can be used to the issue.
Since there's a previous fates related issue linked above with the same errors, I'm going to just clean the branch up for the other issues and follow the review comments. The other issue can be used to the issue.
Sorry for the delay in trying to address the fates-side of this. Just to clarify, you're saying your going to move forwards towards integration @peterdschwartz? If so, that sounds good to me.
Since there's a previous fates related issue linked above with the same errors, I'm going to just clean the branch up for the other issues and follow the review comments. The other issue can be used to the issue.
Sorry for the delay in trying to address the fates-side of this. Just to clarify, you're saying your going to move forwards towards integration @peterdschwartz? If so, that sounds good to me.
@glemieux Yes, and now I know how to how to reliably reproduce the issue, we can use #6125 to track progress. I will share a branch with the changes to do it.
@bishtgautam I have rebased the branch and addressed your comments. There's potentially a couple of different ways to handle the else
condition in the PhenologyMod, so I went with what I believe to be the simplest and added a comment explaining why. Let me know what you think
e3sm_developer came back [BfB] with the exception of one FATES test that will FAIL restart comparison for the time being.
This is ready to merge.