filter.html Detailed Program Execution outline neglects the 'analysis' stage.

Open kdraeder opened this issue 3 years ago • 7 comments

Describe the bug

The outline skips from stage 'postassim' to 'output'. I don't know what belongs between 'analysis' and 'output', i.e. what the difference between them could be.

A separate, minor clarification: I think the items about writing stages should say that they write to the files, rather than write the files, since the latter implies writing each whole file at once, which can't happen before all obs have been processed.

Which model(s) are you working with?

filter

Version of DART

Which version of DART are you using? v9.10.1-68-ge3644267f

Have you modified the DART code?

No

kdraeder avatar May 21 '21 14:05 kdraeder

if it doesn't, the docs need to distinguish between writing multiple files, one per ensemble member, and the combination file format ("single file"), where all the members are written to a single file and multiple time steps in the file are possible. for the combination "single file" format, output always contains the last time step, while analysis can contain a diagnostic time series.

output is not automatically written since you might not want it if running PMO, for example, or running with all evaluate-only observations.

for the multi-file case we currently can't loop inside filter over multiple time steps, so the contents of output may always be the same as analysis, but for consistency with other modes all the stage options are there.
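
To make the single-file distinction above concrete, here is a minimal Python sketch. It is not DART code: the `SingleFileDiagnostics` class and its method are hypothetical, and plain Python lists stand in for netCDF files. It only illustrates the described behaviour, where "analysis" can accumulate a time series while "output" holds the last time step.

```python
# Illustrative stand-in only; the class and method names are hypothetical, not DART's.
class SingleFileDiagnostics:
    """Mimics the single-file behaviour described above using in-memory lists."""

    def __init__(self):
        self.analysis = []   # grows by one record per assimilation time step
        self.output = None   # always replaced, so only the last time step survives

    def record_step(self, time, ensemble_state):
        """Store the posterior ensemble for one time step in both 'files'."""
        snapshot = [list(member) for member in ensemble_state]  # copy each member
        self.analysis.append((time, snapshot))
        self.output = (time, snapshot)


diag = SingleFileDiagnostics()
diag.record_step(0, [[1.0, 2.0], [1.1, 2.1]])   # two members, two state variables
diag.record_step(1, [[1.5, 2.5], [1.6, 2.6]])

print(len(diag.analysis))   # number of time steps kept in "analysis"
print(diag.output[0])       # time of the only record kept in "output"
```

In the multi-file case described above, where filter cannot loop over multiple time steps, `record_step` would be called once, and the two stand-ins would hold the same state.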

nancycollins avatar May 21 '21 14:05 nancycollins

Just to clarify a bit here Kevin, is this a problem with the documentation? i.e. the outline in the documentation is wrong.

hkershaw-brown avatar May 21 '21 14:05 hkershaw-brown

@hkershaw-brown I think the outline in the documentation is incomplete. I know it can't include everything, but if it includes the writing of some stages, I think it should include the writing of all stages.

I also just noticed that it doesn't include the writing of 'forecast'.

@nancycollins Thanks for the clarification and reminding me of the single file case.

kdraeder avatar Jun 01 '21 19:06 kdraeder

related to https://github.com/NCAR/DART/issues/90 ?

hkershaw-brown avatar Jun 04 '21 17:06 hkershaw-brown

#90 (converted to discussion #331) provides context for this, but also has a lot of tangential discussion. I'm hoping this issue can focus on the contents of the "Detailed program execution flow" in filter.rst (was filter.html). Here's the text, with my comments about what I think is missing. The state vectors of pairs of stages can be identical, if filter doesn't do anything between them. It's up to the user to know that and choose stages to write accordingly. There are variations of this; write_all_stages_at_end, {input,output}_state_file_list, single_file_{in,out} but this program flow ignores those.

  • Read in observations.
  • Read in state vectors from model netcdf restart files.
  • Initialize inflation fields, possibly reading netcdf restart files.
  • If requested, initialize and write to "input" netcdf diagnostic files.
  • Trim off any observations if start/stop times specified.
  • Begin main assimilation loop:
    • Check model time vs observation times:

      • If current assimilation window is earlier than model time, error.
      • If current assimilation window includes model time, begin assimilating.
      • If current assimilation window is later than model time, advance model:
        • Write out current state vectors for all ensemble members (a second place where "input" files might be written?).
        • Advance the model by subroutine call or by shell script:
          • Tell the model to run up to the requested time.
        • Read in new state vectors from netcdf files for all ensemble members.
        • If requested, write out "forecast" files (members, mean, spread).
    • Apply prior inflation if requested.

    • Compute ensemble of prior observation values with forward operators.

    • If requested, compute and write the "preassim" netcdf diagnostic files. This is AFTER any prior inflation has been applied.

    • Compute prior observation space diagnostics.

    • Assimilate all observations in this window:

      • Get all obs locations and kinds.
      • Get all state vector locations and kinds.
      • For each observation:
        • Compute the observation increments.
        • Find all other obs and states within localization radius.
        • Compute the covariance between obs and state variables.
        • Apply increments to state variables weighted by correlation values.
        • Apply increments to any remaining unassimilated observations.
        • Loop until all observations in window processed.
    • If requested, compute and write the "postassim" netcdf diagnostic files (members, mean, spread). This is BEFORE any posterior inflation has been applied.

    • Apply posterior inflation if requested.

    • Compute ensemble of posterior observation values with forward operators.

    • Compute posterior observation space diagnostics.

    • If requested, compute and write out the ~~"output"~~ "analysis" netcdf diagnostic files (members, mean, spread). This is AFTER any posterior inflation has been applied.

    • Loop until all observations in input file processed.

  • Close diagnostic files.
  • Write out final observation sequence file.
  • Write out inflation restart files if requested.
  • Write out ~~final~~ "output" state vectors to model restart files if requested.
  • Release memory for state vector and observation ensemble members.
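
As a rough illustration of the write order in the outline above, here is a small Python sketch. It is not taken from DART: the function `write_events` and its arguments are hypothetical, and it ignores the variations mentioned earlier (write_all_stages_at_end, file lists, single-file mode). It also assumes, for simplicity, that "forecast" can be written in every window, whereas the outline only writes it after a model advance.

```python
# Hypothetical sketch of the write order in the outline above; not DART source code.
PER_WINDOW_STAGES = ["forecast", "preassim", "postassim", "analysis"]
ALL_STAGES = ["input"] + PER_WINDOW_STAGES + ["output"]


def write_events(requested_stages, n_windows):
    """Return (window, stage) pairs in the order the outline would write them.

    Simplification: "forecast" is listed for every window, although the outline
    only writes it after a model advance.
    """
    requested = set(requested_stages)
    unknown = requested - set(ALL_STAGES)
    if unknown:
        raise ValueError(f"unknown stage(s): {sorted(unknown)}")

    events = []
    if "input" in requested:
        events.append((None, "input"))          # before the assimilation loop
    for window in range(n_windows):
        for stage in PER_WINDOW_STAGES:         # inside the assimilation loop
            if stage in requested:
                events.append((window, stage))
    if "output" in requested:
        events.append((None, "output"))         # after the loop, last state only
    return events


for event in write_events(["preassim", "analysis", "output"], n_windows=2):
    print(event)
```

The point of the sketch is that "analysis" belongs inside the loop, after posterior inflation, while "output" is written once after the loop ends.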

@nancycollins wrote (above) that "analysis" can contain a time series of vectors (if the model advances can be run within the filter loop), while "output" has just the last time.
Is that also true for "forecast", "preassim", and "postassim"? In CAM assimilations the "output" vectors are written into existing initial files ("restart files" in this doc), which have additional variables in them, so the "output" and "analysis" files would look different, but the state vectors would be the same.
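
The CAM point above (files that look different but carry the same state vectors) can be sketched in Python as follows. This is only an illustration: dicts stand in for netCDF files, and the helper `same_state` and all variable names are made up for the example.

```python
# Illustrative only: dicts stand in for netCDF files, and all variable names are made up.
def same_state(file_a, file_b, state_vars):
    """True if every listed state variable is present in both files with equal values."""
    return all(
        name in file_a and name in file_b and file_a[name] == file_b[name]
        for name in state_vars
    )


state_vars = ["T", "U"]                              # hypothetical state variables
analysis_file = {"T": [280.0, 281.5], "U": [3.2, 4.1]}
output_file = {                                      # initial file with extra fields
    "T": [280.0, 281.5],
    "U": [3.2, 4.1],
    "extra_field": [0.1, 0.2],                       # not part of the DART state
}

print(analysis_file == output_file)                        # files differ as a whole
print(same_state(analysis_file, output_file, state_vars))  # state vectors agree
```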

kdraeder avatar Mar 23 '23 00:03 kdraeder

fixing as part of bigger documentation fix, leaving as is for now.

hkershaw-brown avatar May 28 '24 13:05 hkershaw-brown

reopening,

see Kevin's changes: https://github.com/NCAR/DART/pull/677

and notes on the (many) problems with the filter.rst page https://github.com/NCAR/DART/pull/677#pullrequestreview-2057931199

hkershaw-brown avatar Aug 09 '24 18:08 hkershaw-brown