Methane conservation error in CH4Mod when doing a hybrid restart with CISM%EVOLVE over Greenland
Describe the bug
I am doing a hybrid restart to turn on CISM over Greenland, branching from year 101 of the picontrol run n1850.ne30_tn14.nor3_b01-cplhist-noLU.20250716 (https://github.com/NorESMhub/noresm3_dev_simulations/issues/194). The model runs for one year, then crashes right after restarting due to a methane conservation error in ch4Mod.F90. I repeated the same procedure without turning on CISM (i.e., a hybrid restart with the exact same setup as in n1850.ne30_tn14.nor3_b01-cplhist-noLU.20250716), and in that case the model runs for 5 years with no issues.
- NorESM version: noresm3_0_beta01
- HPC platform: betzy
- Compiler: intel
- Compset: 1850_CAM70%LT%NORESM%CAMoslo_CLM60%FATES_CICE_BLOM%HYB%ECO_MOSART_CISM2%GRIS-EVOLVE_SWAV_SESP
- Resolution: ne30pg3_tn14_gris4
- Error message:
  From cesm.log: Gridcell-level CH4 Conservation Error in CH4Mod driver
  From ESMF_log: Methane conservation error
  ERROR in ch4Mod.F90 at line 2352
To Reproduce
Steps to reproduce the behavior:
- Get noresm3_0_beta01
- /cluster/projects/nn11022k/mpet/NorESM/Repository/noresm3_0_beta01/cime/scripts/create_newcase \
    --case "${CASEDIR}" \
    --compset 1850_CAM70%LT%NORESM%CAMoslo_CLM60%FATES_CICE_BLOM%HYB%ECO_MOSART_CISM2%GRIS-EVOLVE_SWAV_SESP \
    --res ne30pg3_tn14_gris4 \
    --machine betzy \
    --project nn11022k \
    --q normal \
    --walltime 48:00:00 \
    --pecount L \
    --run-unsupported \
    --compiler intel \
    --user-mods-dir /cluster/projects/nn11022k/mpet/NorESM/Repository/noresm3_0_beta01/cime_config/usermods_dirs/reduced_out_devsim/
- ./xmlchange RUN_TYPE=hybrid
  ./xmlchange RUN_REFDIR=/cluster/projects/nn11022k/mpet/cmip7_testrestart_grisonly/restarts/n1850.ne30_tn14.nor3_b01-cplhist-noLU.20250716/0101-01-01-00000
  ./xmlchange RUN_REFCASE=n1850.ne30_tn14.nor3_b01-cplhist-noLU.20250716
  ./xmlchange RUN_REFDATE=0101-01-01
  ./xmlchange RUN_STARTDATE=0101-01-01
  ./xmlchange STOP_N=5
  ./xmlchange STOP_OPTION=nyears
  ./xmlchange REST_OPTION=nyears
  ./xmlchange REST_N=1
  ./xmlchange GLC_AVG_PERIOD=yearly
  ./case.setup
  ./case.build
- Set cpl, cam, and clm namelists as in https://github.com/NorESMhub/noresm3_dev_simulations/issues/194. Set the following cism namelist:
  #CISM-Greenland-only
  cisminputfile = '/cluster/projects/nn11022k/mpet/dataset/new_cism_grids/gris/inputfiles/Greenland_4km.init.c27022025.nc'
  nsn=721
  ewn=421
  adjust_input_thickness = .false.
  bmlt_float = 6
  bmlt_float_thermal_forcing_param = 0
  bmlt_float_ismip6_magnitude = 1
  isostasy = 0
  limit_marine_cliffs = .false.
  marine_margin = 1
  calving_minthck = 100.
  calving_timescale = 1
  ocean_data_domain = 2
  ocean_data_extrapolate = 1
  remove_icebergs = .true.
  remove_isthmuses = .false.
  flow_factor_float = 1.0
  gamma0 = 0
  block_inception = .true.
  force_retreat = 1
  restart = 0
  nzocn = 30
  dzocn = 60.
  esm_history_vars = "smb artm thk usurf topg uvel vvel temp bmlt bwat beta_internal floating_mask grounded_mask bpmp acab_applied bmlt_applied calving_rate iareaf iareag imass imass_above_flotation total_smb_flux total_bmb_flux total_calving_flux total_gl_flux ice_sheet_mask ice_cap_mask thermal_forcing thermal_forcing_lsrf"
  dt = 0.1
  dt_diag = 0.1
Case folder: /cluster/projects/nn11022k/mpet/cmip7_testrestart_grisonly/n1850.ne30_tn14_gl4_testrestart1
Output: /cluster/work/users/mpet/noresm/n1850.ne30_tn14_gl4_testrestart1/run
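To see which log files in the run directory contain the two messages quoted in the error report above, something like the following could be used. This is only a sketch: the helper name `find_ch4_errors` is hypothetical, and the file-name patterns `cesm.log.*` and `PET*.ESMF_LogFile` are assumptions about how the logs are named (uncompressed, directly in the run directory).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the log files in a run directory that contain
# either of the two methane conservation messages quoted in this issue.
# Assumption: logs are plain text and sit directly in the run directory.
find_ch4_errors() {
    local rundir="$1"
    # Select only the coupler and ESMF logs, then let grep list the files
    # (-l) that match either message (-E enables the alternation).
    find "$rundir" -maxdepth 1 -type f \
        \( -name 'cesm.log.*' -o -name 'PET*.ESMF_LogFile' \) \
        -exec grep -lE 'CH4 Conservation Error|Methane conservation error' {} +
}

# Example call with the output directory from this report:
# find_ch4_errors /cluster/work/users/mpet/noresm/n1850.ne30_tn14_gl4_testrestart1/run
```

The function prints nothing when no log matches, so an empty result after a crash would suggest the logs are named or located differently than assumed here.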
@hgoelzer @mvdebolskiy @mvertens @gold2718
Hi @mpetrini-norce, the error is coming from the land methane conservation check. However, we've had this type of error before (with the extreme aerosol burst bug), and then it had nothing to do with methane conservation; the methane conservation code was simply the first to complain about nonsensical outputs (negative forcing, etc.). That doesn't mean I know what is causing this, but it may in fact be CISM, even if the error messages don't suggest it.
@mpetrini-norce I do not have access to your case folder; however, judging by the logs, the model has run 1 year, written a restart file, and restarted.
Can you dump the CaseStatus here?
Thanks @maritsandstad and @mvdebolskiy for the replies. I've copied the case folder here: /cluster/projects/nn9560k/mpet/n1850.ne30_tn14_gl4_testrestart1. Below is the CaseStatus:
2025-07-31 13:57:19: xmlchange success
Here I tried to manually restart the run, but it failed with the same error message. So you're right, Matvey: it crashes after restarting. I'll correct that.
2025-07-31 20:24:53: xmlchange success
Update on this bug: I found a temporary fix.
The model completed a 5-year run with GrIS active and methane turned off (use_lch4 = .false.); however, the same crash occurs when methane is subsequently turned on in another hybrid run restarting from GrIS_active-methane_off. @mvdebolskiy noted that the conservation errors become smaller when running for a longer time (15 years) with methane turned off, so one option could be to extend the GrIS_active-methane_off run to see if at some point the error goes away.
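The methane-off test relies on a single CLM namelist switch. A minimal sketch of setting it from the case directory is shown here, assuming the switch is placed in `user_nl_clm`; the helper `set_nl_option` is hypothetical and is written so that an earlier assignment of the same variable is replaced rather than duplicated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "key = value" in a user_nl_* file, replacing any
# existing assignment of the same key so the file never holds two entries.
set_nl_option() {
    local file="$1" key="$2" value="$3"
    touch "$file"
    # Copy every line except an old assignment of this key. grep exits 1
    # when nothing is left to print, hence the "|| true".
    grep -vE "^[[:space:]]*${key}[[:space:]]*=" "$file" > "${file}.tmp" || true
    mv "${file}.tmp" "$file"
    # Append the new assignment.
    printf '%s = %s\n' "$key" "$value" >> "$file"
}

# Turn the methane model off, as in the 5-year test described above.
# Run from the case directory:
# set_nl_option user_nl_clm use_lch4 .false.
```

Calling the helper again with `.true.` would switch methane back on for the follow-up hybrid run without leaving a stale `use_lch4` line behind.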
Another strategy that is working for now, but is probably not ideal in the long term, is to use a patch similar to the one for lake area changes (see https://github.com/ESCOMP/CTSM/issues/43): skip the methane conservation check if dynamic glaciers are on and we are at the beginning of the year or at the beginning of a simulation (the patch goes at line 2327 in biogeochem/ch4Mod.F90).
With this patch, the model completed a 5-year run with GrIS active and methane turned on, restarting from the n1850.ne30_tn14.nor3_b01-cplhist-noLU.20250716 run (https://github.com/NorESMhub/noresm3_dev_simulations/issues/194). More discussion will follow to understand whether we can find a cleaner fix (@mvdebolskiy will keep looking into that) and/or whether this patch is acceptable, as it was for the lake area changes case.