DART icon indicating copy to clipboard operation
DART copied to clipboard

obs_def_land_mod and soil moisture

Open timhoar opened this issue 4 years ago • 3 comments

Use case

The observation converters for soil moisture observations are not consistent. We may need to specify more DART QUANTITYs to cover the situation. Many NASA products use 'volumetric soil moisture percent', CLM uses both 'mm3/mm3' and 'kg/m2', NOAH uses 'm3/m3', so it is really difficult to write a forward operator to return what is expected, and similarly the observation providers use an equally disparate set of units. Conversion between many of these requires knowledge of soil properties.

Depending on what variables people use to include in the DART state, the units of the returned quantity are different. If the CLM history file is used to provide H2OSOI as QTY_SOIL_MOISTURE, the units are mm3/mm3:

	float H2OSOI(time, levsoi, lat, lon) ;
		H2OSOI:long_name = "volumetric soil water (natural vegetated and crop landunits only)" ;
		H2OSOI:units = "mm3/mm3" ;```

If the CLM restart file is used to provide the prognostic variables H2OSOI_LIQ, H2OSOI_ICE, the units are kg/m2:

	double H2OSOI_LIQ(column, levtot) ;
		H2OSOI_LIQ:long_name = "liquid water" ;
		H2OSOI_LIQ:units = "kg/m2" ;

	double H2OSOI_ICE(column, levtot) ;
		H2OSOI_ICE:long_name = "ice lens" ;
		H2OSOI_ICE:units = "kg/m2" ; 

So - if people specify H2OSOI from the history file as the source of QTY_SOIL_MOISTURE, they get soil moistures appropriate for the obs_def_COSMOS_mod.f90 and maybe some other observations. The CLM model_mod currently tries to return H2OSOI if asked for a soil moisture and if it does not exist in the state, the soil moisture returned is the sum of H2OSOI_LIQ and H2OSOI_ICE .. which has units of kg/m2 and is NOT appropriate for the obs_def_COSMOS_mod, but is useful for other soil moisture observations.

Is your feature request related to a problem?

We need to assimilate soil moisture observations - mostly from SMOS, SMAP - satellite observations.

Describe the your prefered solution

I don't have a preferred solution. More DART QUANTITYs? Force the obs_converters and model_interpolate routines to return identical units?

Describe any alternatives you have considered

The CLM model_mod supports different units by including different variables in the DART vector. This is incredibly cryptic and prone to error.

timhoar avatar Jun 29 '21 19:06 timhoar

I have a few more thoughts on this ... I'll use CLM as the example, but it could happen with any land model. The number of soil layers before bedrock may be different in different columns in the same gridcell. There is at least one forward operator that tries to construct a soil moisture profile (obs_def_COSMOS_mod.f90:get_expected_neutron_intensity) that might need to know when to stop (since I am not sure what values are in the bedrock layers). Similarly, the values in the bedrock layers should probably be unaffected by the assimilation. Just as an example, CLM has a variable with 37 levels, 1-12 are for snow layers, 13-37 are for ground:

    0,   // H2OSOI_LIQ(1,3)
    0,   // H2OSOI_LIQ(2,3)
    0,   // H2OSOI_LIQ(3,3)
    0,   // H2OSOI_LIQ(4,3)
    0,   // H2OSOI_LIQ(5,3)
    0,   // H2OSOI_LIQ(6,3)
    0,   // H2OSOI_LIQ(7,3)
    0,   // H2OSOI_LIQ(8,3)
    0,   // H2OSOI_LIQ(9,3)
    0,   // H2OSOI_LIQ(10,3)
    0,   // H2OSOI_LIQ(11,3)
    0,   // H2OSOI_LIQ(12,3)
    1.39366313069961,   // H2OSOI_LIQ(13,3)
    2.78732764093286,   // H2OSOI_LIQ(14,3)
    4.18099330738051,   // H2OSOI_LIQ(15,3)
    5.57465977472289,   // H2OSOI_LIQ(16,3)
    8.36199353061316,   // H2OSOI_LIQ(17,3)
    11.1493317826354,   // H2OSOI_LIQ(18,3)
    13.9366759125884,   // H2OSOI_LIQ(19,3)
    16.7240271846423,   // H2OSOI_LIQ(20,3)
    19.5113866960329,   // H2OSOI_LIQ(21,3)
    22.2987554315511,   // H2OSOI_LIQ(22,3)
    25.0861342140587,   // H2OSOI_LIQ(23,3)
    27.8735237261098,   // H2OSOI_LIQ(24,3)
    30.6609244485058,   // H2OSOI_LIQ(25,3)
    37.6293790577045,   // H2OSOI_LIQ(26,3)
    44.5978616395396,   // H2OSOI_LIQ(27,3)
    51.5663817040275,   // H2OSOI_LIQ(28,3)
    58.5349431838487,   // H2OSOI_LIQ(29,3)
    65.5035491747487,   // H2OSOI_LIQ(30,3)
    72.4722016198826,   // H2OSOI_LIQ(31,3)
    79.4409024248994,   // H2OSOI_LIQ(32,3)
    0.01,   // H2OSOI_LIQ(33,3)
    0.01,   // H2OSOI_LIQ(34,3)
    0.01,   // H2OSOI_LIQ(35,3)
    0.01,   // H2OSOI_LIQ(36,3)
    0.01,  // H2OSOI_LIQ(37,3)

The value of 0.01 is neither the _FillValuie or missing_value. To my knowledge there is no metadata array in either the restart file or the history file that contains the information about the number of soil layers 'in use'.

As a stopgap measure ... the forward operator requiring the soil moisture profile only needs the soil moisture for the first few meters - and even that is probably overkill - the cosmic ray probes are really only sensitive to the top 10 cm or so.

timhoar avatar Jul 01 '21 13:07 timhoar

I had a chat with Bill Sacks asking if there was some array to indicate what is bedrock, what is not,

Take a look at the LEVGRND_CLASS variable. It's not very explicit, but I think it will give you what you want.

As defined in initVerticalMod:

! The distinction between "shallow" and "deep" bedrock is not made explicitly
! elsewhere. But, since these classes have somewhat different behavior, they are
! distinguished explicitly here.
integer, parameter :: LEVGRND_CLASS_STANDARD        = 1
integer, parameter :: LEVGRND_CLASS_DEEP_BEDROCK    = 2
integer, parameter :: LEVGRND_CLASS_SHALLOW_BEDROCK = 3

The logic for setting this is here:

https://github.com/escomp/CTSM/blob/master/src/main/initVerticalMod.F90#L624-L649

So for column 3 to match the example above:

    1,   // LEVGRND_CLASS(1,3)
    1,   // LEVGRND_CLASS(2,3)
    1,   // LEVGRND_CLASS(3,3)
    1,   // LEVGRND_CLASS(4,3)
    1,   // LEVGRND_CLASS(5,3)
    1,   // LEVGRND_CLASS(6,3)
    1,   // LEVGRND_CLASS(7,3)
    1,   // LEVGRND_CLASS(8,3)
    1,   // LEVGRND_CLASS(9,3)
    1,   // LEVGRND_CLASS(10,3)
    1,   // LEVGRND_CLASS(11,3)
    1,   // LEVGRND_CLASS(12,3)
    1,   // LEVGRND_CLASS(13,3)
    1,   // LEVGRND_CLASS(14,3)
    1,   // LEVGRND_CLASS(15,3)
    1,   // LEVGRND_CLASS(16,3)
    1,   // LEVGRND_CLASS(17,3)
    1,   // LEVGRND_CLASS(18,3)
    1,   // LEVGRND_CLASS(19,3)
    1,   // LEVGRND_CLASS(20,3)
    2,   // LEVGRND_CLASS(21,3)
    2,   // LEVGRND_CLASS(22,3)
    2,   // LEVGRND_CLASS(23,3)
    2,   // LEVGRND_CLASS(24,3)
    2,  // LEVGRND_CLASS(25,3)

timhoar avatar Jul 01 '21 17:07 timhoar

I created a test branch (clm_snow_layer_ignore) to to skip updating snow layers. The only modification is in the dart_to_clm program - I added a routine Ignore_Snow_Layers().

Updating the soil moisture or temperature is throwing 'soil balance error'. The observations are of SOIL_MOISTURE, MODIS_SNOWCOVER_FRAC, MODIS_LEAF_AREA_INDEX, and BIOMASS, the only soil variables being updated are H2OSOI_LIQ, H2OSOI_ICE and T_SOISNO. I believe get_close_state() is declaring all columns that are not vegetated_or_bare_soil or crop to be infinitely far away.

Since the error seems to come immediately after: WARNING: BalanceCheck: soil balance error (W/m2) and then some NaNs are generated ... I decided to leave anything in the snow layers alone. This change got past the soil balance error and died in: urban net longwave radiation error: no convergenc Despite the fact I believe I am skipping all urban columns.

Erik Kluzek commented "Since you are triggering an urban issue you must be modifying something outside of what you intend to, based on your description of what you are doing." I have the model_mod.f90:related() function enabled such that everything in landunits or columns that are NOT 'vegetated_or_bare_soil' or 'crop' are set to be outside the localization radius, so I should not be modifying anything in 'urban' columns or landunits.

timhoar avatar Jul 01 '21 18:07 timhoar