
Adding history variables for use with Newton-Krylov

Open slevis-lmwg opened this issue 4 years ago • 42 comments

Description of changes

decomp_k already had the necessary infrastructure to appear in ctsm history. It's an "inactive" variable and appears in the documentation's master list as K_*. It was dimensioned wrong in the code, so I corrected that in my first commit to this PR, thankfully without changing answers.

Next I will add pathfrac_decomp_cascade to history.

Specific notes

Contributors other than yourself, if any: @wwieder @ekluzek

CTSM Issues Fixed (include github issue #): #1455 #1825

Are answers expected to change (and if so in what way)? No. Variable pathfrac_decomp_cascade will appear in history optionally.

Any User Interface Changes (namelist or namelist defaults changes)? No. Later we may consider interface changes to add these two inactive variables to history when spinning up with Newton-Krylov.

Testing performed, if any: The following test PASSES with the mods of the first commit to this PR: ERP_P36x2_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.cheyenne_gnu.clm-extra_outputs -c /glade/p/cgd/tss/ctsm_baselines/ctsm5.1.dev051

slevis-lmwg avatar Aug 09 '21 23:08 slevis-lmwg

Cheyenne test-suite OK

slevis-lmwg avatar Aug 11 '21 01:08 slevis-lmwg

Izumi test-suite OK

slevis-lmwg avatar Aug 11 '21 22:08 slevis-lmwg

Running tests that compare against dev053:

  • PASS ERP_P36x2_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.cheyenne_gnu.clm-extra_outputs
  • OK Cheyenne test-suite
  • OK Izumi test-suite
  • PASS ERP_P36x2_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.cheyenne_gnu.clm-extra_outputs with user_nl_clm modified to include some of the new variables in history:

hist_fincl1   += 'FPI_vr', 'K_ACT_SOM', 'K_CEL_LIT', 'K_CWD',
                 'CWD_PATHFRAC_L2_vr', 'CWD_RESP_FRAC_L2_vr',
                 'L1_PATHFRAC_S1_vr', 'L1_RESP_FRAC_S1_vr',
                 'S1_PATHFRAC_S3_vr', 'S1_RESP_FRAC_S3_vr'

I will update this post as tests complete.

slevis-lmwg avatar Aug 19 '21 21:08 slevis-lmwg

Thanks for these changes @slevisconsulting . I just wanted to confirm that in this PR are the additional variables still set to inactive by default with the plan of manually turning them on as we're testing the N-K output?

wwieder avatar Aug 19 '21 22:08 wwieder

in this PR are the additional variables still set to inactive by default with the plan of manually turning them on as we're testing the N-K output?

@wwieder that is correct. The last test listed a few posts up will confirm that namelist variable hist_fincl1 adds these variables to history. We also have the option of changing these variables to "active" if that's the preferred solution.

slevis-lmwg avatar Aug 19 '21 23:08 slevis-lmwg

Notes from today's mtg:

  • @wwieder will clone the branch from this PR and perform a single-point cold start for 20 years at a boreal forest location with AD mode off. He will then continue the run for another 20 years or, as @klindsay28 recommended, create a new case that continues for another 20 years.
  • Need column-level history of the necessary variables averaged over the full 20 years.
  • Need restart files from the end of each 20-yr cycle.
  • Before @wwieder starts the runs, @slevisconsulting will add a user_nl_clm to the new directory that Sam will create here: cime_config/usermods_dirs. Sam's file will include the necessary fincl and other hist namelist variables.
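A 20-year average on a single history file can be requested through the history namelist. The sketch below is an illustration only, not the settings committed in this PR: it computes the averaging period in hours, assuming a 365-day (no-leap) calendar, and writes a user_nl_clm fragment. OUTDIR is a hypothetical variable that defaults to a scratch directory; hist_nhtfrq, hist_mfilt, and hist_dov2xy are existing CLM history namelist variables, but the values shown are assumptions.

```shell
#!/bin/sh
# Sketch: build a user_nl_clm fragment for one 20-year mean per history file.
# OUTDIR is hypothetical; point it at a case or usermods directory when used for real.
OUTDIR="${OUTDIR:-$(mktemp -d)}"

YEARS=20
# Negative hist_nhtfrq values are interpreted as hours; assume a no-leap calendar.
HOURS=$((YEARS * 365 * 24))

cat > "$OUTDIR/user_nl_clm" <<EOF
hist_nhtfrq = -$HOURS
hist_mfilt = 1
hist_dov2xy = .false.
EOF

# Show the fragment that was written
cat "$OUTDIR/user_nl_clm"
```

Setting hist_dov2xy to .false. asks for 1-D vector output instead of gridded averages, which is one way to keep column-level fields at column level.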

slevis-lmwg avatar Aug 20 '21 19:08 slevis-lmwg

@slevisconsulting just add a subdirectory to cime_config/usermods_dirs/ with a name something like newton_krylov_spinup. You can add the user_nl_clm there, as well as a shell_commands file if you need to do any xml settings (you should set it to a cold start, so you'll want that).
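As a rough sketch of that layout, assuming nothing beyond what is described above: the script creates the newton_krylov_spinup subdirectory, a user_nl_clm, and a shell_commands file that requests a cold start. CTSM_ROOT is a hypothetical variable that defaults to a scratch directory so the sketch is safe to run anywhere, and the field list is only a placeholder taken from the test earlier in this thread.

```shell
#!/bin/sh
# Sketch: lay out a usermods directory for Newton-Krylov spinup cases.
# CTSM_ROOT is hypothetical; set it to a real CTSM checkout to create the files there.
CTSM_ROOT="${CTSM_ROOT:-$(mktemp -d)}"
MODS_DIR="$CTSM_ROOT/cime_config/usermods_dirs/newton_krylov_spinup"
mkdir -p "$MODS_DIR"

# Namelist settings for cases created with this usermods directory (placeholder list)
cat > "$MODS_DIR/user_nl_clm" <<'EOF'
hist_fincl1 = 'K_ACT_SOM', 'K_CEL_LIT', 'K_CWD'
EOF

# XML settings for the case; CLM_FORCE_COLDSTART=on requests a cold start
cat > "$MODS_DIR/shell_commands" <<'EOF'
./xmlchange CLM_FORCE_COLDSTART=on
EOF
chmod +x "$MODS_DIR/shell_commands"

ls "$MODS_DIR"
```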

ekluzek avatar Aug 20 '21 20:08 ekluzek

Thanks @slevisconsulting. Dumb question here, but what do I need to do when creating a case to get these user_mods to be active? Or will they just be included by default?

wwieder avatar Aug 23 '21 15:08 wwieder

@wwieder this is done similarly to the user-mods used for NEON. It's an option to create_newcase; use ./create_newcase --help to remind yourself of the options. In this case it would be the "--user-mods-dirs newton_krylov_spinup" option.

ekluzek avatar Aug 23 '21 17:08 ekluzek

OK, questions here are moving outside of this issue / PR, but what's the most effective way to create a case for a random point and populate the datm.streams file correctly?

you can see my initial attempts here /glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/scripts/RandomBoreal_0

  <file>/glade/scratch/wwieder/single_point/datmdata/clmforc.GSWP3.c2011.0.5x0.5.Prec.RandomBoreal.1901-01.nc</file>

wwieder avatar Aug 23 '21 22:08 wwieder

what's the most effective way to create a case for a random point and populate the datm.streams file correctly?

you can see my initial attempts here /glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/scripts/RandomBoreal_0

  <file>/glade/scratch/wwieder/single_point/datmdata/clmforc.GSWP3.c2011.0.5x0.5.Prec.RandomBoreal.1901-01.nc</file>

I didn't look at your case directory, but the quickest/simplest approach that I think should work is to point to the existing global atm files rather than generate new ones for this boreal point. The model will run slower due to interpolation calculations, but you won't have to generate a whole new atmosphere dataset. But @ekluzek may have another suggestion.

slevis-lmwg avatar Aug 23 '21 23:08 slevis-lmwg

Sorry to pester here, @ekluzek, but can you suggest how to set up a generic single point (58N, 55E) with less headache than I had here: /glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/scripts/RandomBoreal_0?

wwieder avatar Aug 24 '21 23:08 wwieder

Hey @wwieder, OK, so I looked at your case and also created a case to model from. There are two issues in your case. First, you need to set CLM_USRDAT_NAME to something that refers to the site you are working on; using your nomenclature I used "RANDOMBOREAL1". Second, the compset you are using is configured to get tower site data (that's what the 1Pt refers to in the name), so you want a regular compset name so that you get the global forcing data. You also set the site location with the PTS_LAT and PTS_LON xml variables.

So my example case is here:

/glade/work/erik/ctsm_worktrees/branch1/cime/scripts/cases/RandomBoreal_0

With this create_newcase command:

./create_newcase --case cases/RandomBoreal_0 --compset I2000Clm51Bgc --res CLM_USRDAT --driver nuopc --user-mods-dirs newton_krylov_spinup --mpilib mpi-serial --run-unsupported

(Note: I used the mpi-serial library for MPI because it's just a single-point case.)

I did these xml commands (added them to shell_commands)

./xmlchange CLM_USRDAT_NAME="BOREAL1"
./xmlchange PTS_LON=55.
./xmlchange PTS_LAT=57.958115183246
./xmlchange CASESTR="Random Boreal"

NOTE: CASESTR is not required, but makes sure a few things are set with something besides UNSET

Added this to user_nl_clm

fsurdat = '/glade/scratch/wwieder/single_point/surfdata_hist_16pfts_Irrig_CMIP6_simyr2000_RandomBoreal_c210823.nc'

and this to user_nl_mosart

frivinp_rtm = '/dev/null'

(This is required with the latest updates in externals when MOSART_MODE=null, i.e. when the compset has MOSART rather than SROF. It is a little glitch that we should fix in CTSM so you don't have to do this; there's a MOSART issue about it.)

ekluzek avatar Aug 25 '21 07:08 ekluzek

Thanks Erik. After submitting, now I'm getting this: ERROR: model_meshfile UNSET does not exist

Do you have any suggestions here? my case is here /glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/scripts/RandomBoreal_0

looks like this is true in your case too /glade/work/erik/ctsm_worktrees/branch1/cime/scripts/cases/RandomBoreal_0/CaseDocs/datm_in

wwieder avatar Aug 25 '21 12:08 wwieder

Hi @wwieder. Hmmm. Actually there's a simple step that I left out, that you need to do.

My case did run even with that being UNSET. I think that's actually normal. Although it probably means we should clean this up and give it some value like "UNUSED" to make it more clear that it's not needed.

From comparing my case to yours, the differences I see are:

PTS_LON, PTS_LAT, and CLM_USRDAT_NAME aren't set in your case, and those are the critical things to set correctly. Oh, and I see that you added these settings to your ./shell_commands, but they weren't invoked by the case. So that just means you need to run ./shell_commands manually. That's an important step that I neglected to mention. The case does run shell_commands itself, but it must be either before case.setup or with case.setup:

./shell_commands
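The ordering can be sketched as below, assuming the settings were added to shell_commands after the case was created. The run helper and the DRY_RUN variable are hypothetical conveniences so the sketch only prints the commands by default; set DRY_RUN=0 inside a real case directory to execute them. case.setup, case.build, case.submit, and xmlquery are the usual CIME case scripts.

```shell
#!/bin/sh
# Sketch: apply shell_commands by hand, check the settings, then set up, build, submit.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_case() {
    run ./shell_commands                                 # apply the xmlchange settings first
    run ./xmlquery PTS_LON,PTS_LAT,CLM_USRDAT_NAME       # confirm they are no longer UNSET
    run ./case.setup
    run ./case.build
    run ./case.submit
}

setup_case
```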

ekluzek avatar Aug 25 '21 16:08 ekluzek

Thanks @ekluzek that's what I needed to do! We'll have to work on some new documentation for this somewhere. I'll discuss with @danicalombardozzi where this should go.

wwieder avatar Aug 25 '21 19:08 wwieder

Last question: how can I share the build from the first case with another nearly identical case?

wwieder avatar Aug 25 '21 19:08 wwieder

can I share the build from the first case with another nearly identical case?

I haven't used this in a long while, but try: ./create_clone --case new_case_name --clone existing_case_name

slevis-lmwg avatar Aug 25 '21 19:08 slevis-lmwg

Thanks @ekluzek that's what I needed to do! We'll have to work on some new documentation for this somewhere. I'll discuss with @danicalombardozzi where this should go.

Happy to discuss -- just let me know when!

danicalombardozzi avatar Aug 25 '21 21:08 danicalombardozzi

@wwieder the important argument to create_clone is "--keepexe", which will then share the built case exe with the new case. There are some notes on it in the help:

./create_clone --help
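Putting the two suggestions together, the clone command would look roughly like the one printed by this sketch. EXISTING_CASE and NEW_CASE are hypothetical variables; their defaults reuse case names from this thread. The script only prints the command, which would then be run from cime/scripts.

```shell
#!/bin/sh
# Sketch: assemble a create_clone command that reuses the existing build.
# EXISTING_CASE and NEW_CASE are hypothetical variables with illustrative defaults.
EXISTING_CASE="${EXISTING_CASE:-RandomBoreal_0}"
NEW_CASE="${NEW_CASE:-RandomBoreal_0b}"

clone_command() {
    # --keepexe shares the executable already built for the existing case
    printf './create_clone --case %s --clone %s --keepexe\n' "$NEW_CASE" "$EXISTING_CASE"
}

clone_command
```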

ekluzek avatar Aug 25 '21 21:08 ekluzek

@wwieder @klindsay28 I have updated the branch to the latest tag, as @wwieder suggested. See if you can start over creating the case that worked before (was it random_boreal?).

slevis-lmwg avatar Nov 23 '21 17:11 slevis-lmwg

I created a clone that still won't build. Error below.

Building CDEPS with output to file /glade/scratch/wwieder/RandomBoreal_0b/bld/CDEPS.bldlog.211123-110531
Calling /glade/work/wwieder/ctsm/ctsm5.1_N-K_test/components/cdeps/cime_config/buildlib
- Building clm library
Building lnd with output to /glade/scratch/wwieder/RandomBoreal_0b/bld/lnd.bldlog.211123-110531
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/../src/utils/clmfates_interfaceMod.F90(858): error #6460: This is not a field name that is defined in the encompassing structure. [HLM_SP_TLAI]
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/../src/utils/clmfates_interfaceMod.F90(859): error #6460: This is not a field name that is defined in the encompassing structure. [HLM_SP_TSAI]
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/cime/../src/utils/clmfates_interfaceMod.F90(860): error #6460: This is not a field name that is defined in the encompassing structure. [HLM_SP_HTOP]
Component lnd build complete with 8 warnings
clm built in 145.318698 seconds
ERROR: BUILD FAIL: clm.buildlib failed, cat /glade/scratch/wwieder/RandomBoreal_0b/bld/lnd.bldlog.211123-110531

wwieder avatar Nov 23 '21 18:11 wwieder

So sorry... I forgot to run checkout_externals! I ran it now, and I hope that's it!

Sam


slevis-lmwg avatar Nov 23 '21 21:11 slevis-lmwg

Ok, still no dice here.

I checked out a new branch from your fork, Sam:

git checkout -b NK_test2 origin/hist_vars_for_newton_krylov

I also checked out externals and then created a new case:

./create_newcase --case RandomBoreal_0b --compset I1850Clm51Bgc --res CLM_USRDAT --driver nuopc --user-mods-dirs newton_krylov_spinup --mpilib mpi-serial --run-unsupported

but the model still cannot build:

Building lnd with output to /glade/scratch/wwieder/RandomBoreal_0b/bld/lnd.bldlog.211124-063014
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/src/soilbiogeochem/SoilBiogeochemDecompCascadeCNMod.F90(14): error #6580: Name in only-list does not exist or is not accessible. [USE_VERTSOILC]
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/src/soilbiogeochem/SoilBiogeochemDecompCascadeCNMod.F90(614): error #6404: This name does not have a type, and must have an explicit type. [USE_VERTSOILC]
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/src/soilbiogeochem/SoilBiogeochemDecompCascadeCNMod.F90(614): error #6341: A logical data type is required in this context. [USE_VERTSOILC]
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/src/soilbiogeochem/SoilBiogeochemDecompCascadeCNMod.F90(799): error #6341: A logical data type is required in this context. [USE_VERTSOILC]
Component lnd build complete with 8 warnings
clm built in 166.155986 seconds
ERROR: BUILD FAIL: clm.buildlib failed, cat /glade/scratch/wwieder/RandomBoreal_0b/bld/lnd.bldlog.211124-063014

wwieder avatar Nov 24 '21 14:11 wwieder

@slevisconsulting I think there might be some changes you'll have to make on your branch due to things I did in some of my recent tags. I removed use_vertsoilc, so you'll have to make some changes in your code to take that into account. It might not have shown up as a conflict, but you'll need to resolve it before this will work. So you'll probably want to get at least one case working, and then you might want to run the full testing on it to make sure things are working well.

ekluzek avatar Nov 24 '21 22:11 ekluzek

The error that @wwieder posted shows that my branch still included SoilBiogeochemDecompCascadeCNMod.F90, which it shouldn't have. I think you're right @ekluzek that this was an undetected conflict that slipped through the cracks.

So I've gone ahead and removed SoilBiogeochemDecompCascadeCNMod.F90 with git rm, and I successfully completed a run with this global case:

./create_newcase --case ~/cases_mimics/newton_krylov_global --compset I1850Clm51Bgc --res f10_f10_mg37 --driver nuopc --user-mods-dirs newton_krylov_spinup --mpilib mpi-serial --run-unsupported

@wwieder and @klindsay28 let's keep iterating if you still encounter problems...

slevis-lmwg avatar Nov 26 '21 18:11 slevis-lmwg

To elaborate a bit on the conflict that slipped through the cracks: When I updated my branch to dev062, I added a comment to the commit saying that I had resolved a conflict with SoilBiogeochemDecompCascadeCNMod.F90. I thought I had resolved it by accepting the file as removed, but now I wonder whether I actually typed git rm at the time. I guess I probably didn't.

In any case, now I have, and the code runs for me.
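For readers who have not hit this kind of conflict before, here is a self-contained sketch of it in a throwaway repository. It is an illustration under stated assumptions, not a record of what happened on this branch: the file name OldModule.F90, the branch name feature, and the commit messages are all hypothetical. One branch edits a file that the other branch deletes, the merge stops with a modify/delete conflict, and git rm is the step that accepts the removal.

```shell
#!/bin/sh
# Sketch: reproduce and resolve a modify/delete merge conflict in a scratch repo.
# All names here (OldModule.F90, feature) are hypothetical.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "module OldModule" > OldModule.F90
git add OldModule.F90
git commit -q -m "Add OldModule.F90"
upstream="$(git rev-parse --abbrev-ref HEAD)"   # default branch name varies by setup

# The feature branch keeps editing the file...
git checkout -q -b feature
echo "! local edit" >> OldModule.F90
git commit -q -am "Edit OldModule.F90 on the feature branch"

# ...while upstream removes it.
git checkout -q "$upstream"
git rm -q OldModule.F90
git commit -q -m "Remove OldModule.F90 upstream"

# Merging upstream into the feature branch stops with a modify/delete conflict.
git checkout -q feature
git merge "$upstream" || true
git status --short

# The file stays in the working tree until it is removed explicitly.
git rm -q OldModule.F90
git commit -q -m "Merge upstream: accept removal of OldModule.F90"
```

The point of the sketch is the second-to-last step: until git rm is run, the conflicted file is still present in the working tree and would still be compiled.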

slevis-lmwg avatar Nov 26 '21 18:11 slevis-lmwg

I'm still unable to build after checking out your latest code, @slevisconsulting.

Do I need to manage externals again? (This was not done.)

Here's the error I'm getting now:

Building lnd with output to /glade/scratch/wwieder/RandomBoreal_0c/bld/lnd.bldlog.211129-113356
/glade/work/wwieder/ctsm/ctsm5.1_N-K_test/src/utils/clmfates_interfaceMod.F90(2218): error #6460: This is not a field name that is defined in the encompassing structure. [FCANSNO_PA]
Component lnd build complete with 8 warnings
clm built in 143.874794 seconds
ERROR: BUILD FAIL: clm.buildlib failed, cat /glade/scratch/wwieder/RandomBoreal_0c/bld/lnd.bldlog.211129-113356

wwieder avatar Nov 29 '21 18:11 wwieder

do I need to manage externals again (this was not done).

@wwieder I do recommend running ./manage_externals/checkout_externals, since you haven't yet. Also, I don't think there's any harm in rerunning it even if you had already done so.

I hope this fixes it. If not, I don't yet know what else to try.
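A small guard like the following could make the step harder to forget after switching branches. It is a sketch only: update_externals is a hypothetical helper name, and it assumes the checkout has manage_externals/checkout_externals at its top level, as in the command above. The second call uses the script's -S status option to report whether the externals now match what the branch expects.

```shell
#!/bin/sh
# Sketch: refresh externals for a checkout, then report their status.
# update_externals is a hypothetical helper; pass it the top of the CTSM checkout.
update_externals() {
    root="$1"
    if [ ! -x "$root/manage_externals/checkout_externals" ]; then
        echo "No manage_externals/checkout_externals under $root" >&2
        return 1
    fi
    (
        cd "$root" || exit 1
        ./manage_externals/checkout_externals &&
            ./manage_externals/checkout_externals -S
    )
}

# Example (path is a placeholder): update_externals /path/to/ctsm
```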

slevis-lmwg avatar Nov 29 '21 19:11 slevis-lmwg

+1 @slevisconsulting updating externals seems to have done the trick (at least for the build). Assuming all looks good, I'll let @klindsay28 know where he can find a case to clone that actually works again.

wwieder avatar Nov 29 '21 21:11 wwieder