DART Can coupler history (forcing) files from the CAM6 Reanalysis be used in CESM3?

The implementation of the NUOPC mediator in CESM3 (replacing CESM2's coupler) has given the forcing files and the variables in them new names. This raises the question of whether the Reanalysis files (CESM2.1) can be fed to CESM for assimilations using the CESM3 surface models: CTSM, MOM6, CICE, ... It appears that they can.

I've attached a list of the stream files from the case referenced there, but I don't have much experience looking at the contents of stream files, so it would be helpful for an expert or 2 to do that. keerzhang_user_nl_datm_streams.txt

In the user_nl_datm_streamfile file I see:

    CPLHISTForcing.State1hr:datavars=a2x1h_Sa_u Sa_u, a2x1h_Sa_v Sa_v

but in a similar file it was:

    CPLHISTForcing.State1hr:datavars=a2x1h_Sa_u u, a2x1h_Sa_v v

I don't know whether those are equivalent.
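To make the difference between the two settings explicit, here is a minimal Python sketch that splits each `datavars` value into (forcing-file variable, mapped name) pairs and compares them. The helper name `parse_datavars` is hypothetical, and this is only string handling: it does not tell us whether DATM treats `Sa_u` and `u` as the same field.

```python
def parse_datavars(value):
    """Split a datavars value into {file_variable: mapped_name} pairs."""
    pairs = {}
    for item in value.split(","):
        file_var, mapped_name = item.split()
        pairs[file_var] = mapped_name
    return pairs


# The two settings quoted above.
this_file = parse_datavars("a2x1h_Sa_u Sa_u, a2x1h_Sa_v Sa_v")
similar_file = parse_datavars("a2x1h_Sa_u u, a2x1h_Sa_v v")

# Both settings read the same variables from the forcing files...
same_file_vars = this_file.keys() == similar_file.keys()

# ...but map them to different names.
differing = {
    var: (this_file[var], similar_file[var])
    for var in this_file
    if this_file[var] != similar_file.get(var)
}

print(same_file_vars)
print(differing)
```

So the two files agree on what is read from the forcing files and differ only in the second name of each pair; whether that second name matters is the part that needs an expert.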

I also haven't tested them. I'm hoping that it will be simple for someone who runs a surface model to try this in CESM3. It wouldn't require any assimilation.

We will probably want to add files and documentation to describe how to use the old forcing files.

May 27 '25 19:05 kdraeder

I've attached a screen shot of lists of variables from one type of forcing file from CESM2.1 and CESM3, with key differences highlighted. There are 4 more types of forcing files, which may have more or fewer differences.

Note that the default CAM in CESM3 is the spectral element version, so the "grid" dimensions are very different.

May 27 '25 21:05 kdraeder

@kdraeder It makes sense for me to test this with CTSM (CESM3). As you say, the reanalysis files look like they should be compatible with CESM3. I haven't made any NUOPC-related updates to the current CTSM scripting (CESM2), and I'm not sure how complicated that will be.

May 28 '25 14:05 braczka

@braczka Thanks for volunteering! Let me know if I might have helpful context.

I believe, to first order, there shouldn't need to be changes to DART scripting due to NUOPC. Filter is still communicating via files.
There may be some namelist changes related to the mediator, and the usual kinds of updates when dealing with new component versions.

May 28 '25 17:05 kdraeder

Based on software standup discussion I will take a look at this after the ESPAT meeting (late June).

May 29 '25 17:05 braczka

@braczka, @hkershaw-brown suggested that the variable names might be defined in a dictionary, which a user could modify to use the old names. It appears that the contents of the stream files serve as the dictionary.
For example, models/clm/shell_scripts/cesm2_2/datm.streams.txt.CPLHISTForcing.State3hr_single_year has lists like

    <fileNames>
         f.e21.FHIST_BGC.f09_025.CAM6assim.011.cpl_NINST.ha2x3h.RUNYEAR.nc

    <fieldInfo>
        <variableNames>
           a2x3h_Sa_z           z
           a2x3h_Sa_tbot        tbot
     ...

The discussion in "It appears that they can" (first comment in this issue) concluded that using stream files with the old names is an easier option than changing the file and variable names of the forcing files to match CESM3 conventions. It also pointed out that 'offsets' consistent with the dataset should be used, so we should try using the existing stream files in the test, rather than figuring out what would need to be exported to stream file templates from CESM3.
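To illustrate the "stream file as dictionary" idea, here is a rough Python sketch that reads the name pairs out of a `<variableNames>` block like the excerpt above and joins them into the comma-separated form used by the `datavars` line quoted earlier in this issue. The function name `parse_variable_names` is hypothetical, the sample block contains only the two pairs shown in the excerpt, and whether the resulting string can be used unchanged in CESM3 is an assumption that the test would need to confirm.

```python
def parse_variable_names(block):
    """Return {file_variable: mapped_name} from the lines of a <variableNames> block."""
    mapping = {}
    for line in block.splitlines():
        parts = line.split()
        # Keep only "name name" lines; skip tags, blank lines, and elisions.
        if len(parts) == 2 and not parts[0].startswith("<"):
            mapping[parts[0]] = parts[1]
    return mapping


# Only the two pairs shown in the excerpt; the real file lists more.
block = """
<variableNames>
   a2x3h_Sa_z           z
   a2x3h_Sa_tbot        tbot
"""

mapping = parse_variable_names(block)

# Same "file_variable mapped_name" pairs, comma-separated.
datavars = ", ".join(f"{file_var} {name}" for file_var, name in mapping.items())
print(datavars)
```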

Jul 24 '25 22:07 kdraeder

Yes -- I am aware of the streamlist template files. When would you like this testing completed? I may have missed this in our last standup.

Jul 25 '25 14:07 braczka

It hasn't come up since July 17, so you didn't miss anything. I'm just working through my open issues, and I wanted to know more about this, so I dug into it and am recording what I found. Just to be clear and short, I'm proposing that we try to use DART's existing stream files and ignore the CESM3 stream files and templates, unless we need something from them.

Based on recent information, it's not very high priority. How about by the end of August?

Jul 25 '25 16:07 kdraeder

@braczka I just rediscovered a note to myself that I should ask you about running a test using the forcing files from near the end of 2020. If that's as easy as any other time, that would test 2 things. (I ran 2020 separately from the other years, so it's good to test it.) If you already have a standard test period case set up, then it makes sense to use that instead.

Jul 25 '25 20:07 kdraeder

I have successfully performed a CLM land-only simulation with the CAM6 reanalysis files using CESM3 (NUOPC coupler). I performed a hybrid run (starting with a 1-year coldstart initial condition) using the ctsm5.3.021 tag with compset 2000_DATM%GSWP3v1_CLM60%BGC-CROP_SICE_SOCN_MOSART_SGLC_SWAV. See my case directory (/glade/work/bmraczka/cases/ctsm5.3.021/clm6_hybrid_e5) and run directory (/glade/derecho/scratch/bmraczka/ctsm5.3.021/clm6_hybrid_e5/run).

Reading an old-style CAM6 reanalysis file into the new coupler requires a custom user_nl_datm file, based on the land CTSM forum issue. I created a user_nl_datm.CPLHISTForcing.complete template file, from which the user_nl_datm_streams_{INST} files are created within the case directory. The code then interprets these to generate the datm.streams_{INST}.xml files used at run time.
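As a rough illustration of the template step, here is a Python sketch that fills per-instance placeholders in a template and returns one user_nl_datm_streams_{INST} text per ensemble member. It is not the actual scripting in the case directory. Assumptions: the template uses the NINST and RUNYEAR placeholders seen in DART's existing stream file names, the instance suffix is a 4-digit zero-padded number, and the directory, the `datafiles` entry, and the function name `instance_stream_files` are made up for the example.

```python
def instance_stream_files(template_text, num_instances, year):
    """Return {file_name: contents} with NINST and RUNYEAR filled in per instance."""
    files = {}
    for inst in range(1, num_instances + 1):
        tag = f"{inst:04d}"  # assumed 4-digit instance suffix
        text = template_text.replace("NINST", tag).replace("RUNYEAR", str(year))
        files[f"user_nl_datm_streams_{tag}"] = text
    return files


# Hypothetical one-line template; the real template has more entries.
template = (
    "CPLHISTForcing.State3hr:datafiles="
    "/hypothetical/dir/f.e21.FHIST_BGC.f09_025.CAM6assim.011.cpl_NINST.ha2x3h.RUNYEAR.nc\n"
)

files = instance_stream_files(template, num_instances=3, year=2011)
for name in sorted(files):
    print(name)
    print(files[name], end="")
```

Writing each entry of `files` into the case directory would then give one streams file per instance.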

I ran into a couple of minor sticking points. First, there are two datm stream files in the case folder. The information must be passed into the user_nl_datm_streams file, as described in an issue I posted to the CTSM forum. Second, the CPLHIST namelist variables have changed slightly in the CESM3 versions (e.g. DATM_CPLHIST_YR_START).

The scripts I used to generate the simulation are within the case directory: CLM6_hybrid_freerun.original and DART_params.csh.

Sep 23 '25 22:09 braczka

@kdraeder -- Just remembered your request for testing forcing file for 2020 separately. I can easily do that next. This current test works with 2011 data only.

Sep 23 '25 23:09 braczka

That would be reassuring for me, but not essential.
I believe that the motivation was to test the transition from 2019 forcing, which I created in 2020, to 2020 forcing, which I created in a later year after resurrecting the case on derecho.

Also, thanks for the good news about the test! I'm glad that it wasn't too much trouble.

Sep 24 '25 00:09 kdraeder

I will perform a mid year 2019 to mid year 2020 test to look at this transition. It's a simple update to the scripting.

Sep 24 '25 14:09 braczka

I completed a Jan-2019 through June-2020 global simulation using the identical setup (i.e. compset, ctsm tag) to test the 2019 to 2020 CAM6 reanalysis transition. It seemed to work fine, without any strange transitions. See the case directory (/glade/work/bmraczka/cases/ctsm5.3.021/clm6_hybrid_e3_2020) and run directory (/glade/derecho/scratch/bmraczka/ctsm5.3.021/clm6_hybrid_e3_2020/run) for more information.
Here are some monthly snapshots of TLAI (total leaf area, units 1-9 m2/m2) and PARVEGLN (absorbed photosynthetic radiation, units 1-500 W/m2). Overall, they look pretty normal given that the initial conditions were based on a 1-year coldstart (no true spinup).

Sep 24 '25 19:09 braczka