
mismatch between ED run period and outputs

Open istfer opened this issue 7 years ago • 12 comments

A question for ED users: I'm doing a run from 2005/01/01 to 2006/12/31 and I'm seeing several issues with the outputs:

  1. It writes out an analysis-T-2004-00-00-000000-g01.h5 file which has 11 non-zero values in it.
  2. The logfile says it starts the sim from 01/01/2005 00:00:00 UTC and the last date printed out is 12/30/2006 00:00:00 UTC so it misses the whole 12/31/2006 at the end
  3. analysis-T-2006-00-00-000000-g01.h5 file has 17461 non-zero values in it whereas it's supposed to have 17520
  4. This 17461 value is consistent with the missing day (24*2 = 48) plus the 11 values in the 2004 file (17461 + 48 + 11 = 17520)
  5. 2005.nc file post-processed by pecan starts with values in the analysis-T-2005-00-00-000000-g01.h5 file, ignoring the 11 values in the 2004 file
  6. Finally, 2006.nc file post-processed by pecan also ends with 59 zero values.

Btw, all analysis-T-YYYY-00-00-000000-g01.h5 files have doy*24*2 values regardless of how many actual values were computed for that year. ED just fills the rest with zeros.
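As a quick sanity check of the bookkeeping above, the counts can be reproduced in a few lines of R. This is only arithmetic on the numbers reported in this post, assuming a half-hourly output step and a 365-day year; the variable names are made up for illustration.

```r
steps_per_day <- 24 * 2               # half-hourly output
slots_2006    <- 365 * steps_per_day  # slots in a non-leap year file

nonzero_2006 <- 17461                 # reported for the 2006 file
in_2004_file <- 11                    # reported for the 2004 file

# Zeros at the end of the 2006 file
missing <- slots_2006 - nonzero_2006

# They should be one missing day plus the values that landed in 2004
stopifnot(missing == steps_per_day + in_2004_file)
```

The same `missing` count is the number of trailing zeros seen in the post-processed 2006.nc.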

These are the values in ED2IN:

   NL%IMONTHA  = 01
   NL%IDATEA   = 01
   NL%IYEARA   = 2005 
   NL%ITIMEA   = 0000
   NL%IMONTHZ  = 12
   NL%IDATEZ   = 31
   NL%IYEARZ   = 2006
   NL%ITIMEZ   = 0000

The site I'm running is Bartlett, which is UTC-05:00 (EST) and UTC-04:00 (EDT). So I can understand why it produces values for 2004, and why there could be 10 of them (5*2 = 10), but what's up with 11 values? Does it have something to do with Fortran indices, or with daylight saving time changes?

Finally, how can I resolve this issue? (I tried running from 01/01/2005 0500 to 01/01/2007 0500; it still produces 1 value in 2004, and at the end, in 2006, it skips the whole of December and complains that there is no 2007JAN.h5)

istfer avatar Dec 18 '17 01:12 istfer

Correct me if I'm wrong, but the only adjustment for timezone issues is in met2model.ED, and it just adds a buffer in front of the time series, which results in losing data points at the end of the original time series. Not sure if that makes sense.

write.configs only writes day month and year, there is no tag for manipulating ITIMEA and ITIMEZ, they're 0000 in the templates

model2netcdf.ED just processes whatever time-period is requested

istfer avatar Dec 18 '17 17:12 istfer

@istfer I have also noticed in my runs that an analysis-T-2005-00-00-000000-g01.h5 file gets made, even when I'm starting from 2006 (at Duke Forest).

mccabete avatar Dec 18 '17 17:12 mccabete

I checked my "extra" analysis-T-2005 file.

GROUP "/" {
   DATASET "BASEFLOW" {
      DATATYPE  H5T_IEEE_F32LE
      DATASPACE  SIMPLE { ( 17520, 1 ) / ( 17520, 1 ) }
      DATA {
      (0,0): 3.52222e-08,
      (1,0): 3.5223e-08,
      (2,0): 3.52239e-08,
      (3,0): 3.52248e-08,
      (4,0): 3.52257e-08,
      (5,0): 3.52265e-08,
      (6,0): 3.52274e-08,
      (7,0): 3.52283e-08,
      (8,0): 3.52292e-08,
      (9,0): 3.523e-08,
      (10,0): 3.52309e-08,
      (11,0): 0,
      (12,0): 0,
      (13,0): 0,
      (14,0): 0,
      (15,0): 0,

I found numbers similar to yours, Istem, but off by one (i.e., I have 10 non-zero values, and 17519 total).

mccabete avatar Dec 18 '17 17:12 mccabete

thanks @mccabete

hmm where to address this then

I feel like it would be the easiest to handle it in the post processing, rather than adding buffer to the met driver and shift everything awkwardly for timezone match

we can let ED do whatever it's doing, in the model2netcdf we can look for different analysis-T-YYYY files if the site is not in UTC zone, read from them and concatenate?

istfer avatar Dec 18 '17 17:12 istfer

different flavors of what happens to the model-data comparison in the 2006 part of one of my 2005/06/01 - 2006/12/31 runs (red : data, blue : model output read by pecan workflow, black : model output read and aligned manually)

plot_zoom_png

to make things more complicated this is only visible in the second year if you're starting the run from the middle of the year as in my 2005/06/01 - 2006/12/31 example. If you're starting your runs from Jan 1st, mismatch appears in the first year as well

istfer avatar Dec 18 '17 19:12 istfer

I could be wrong, but I think @istfer's issue is a simpler one with a simpler fix: ED runs stop as soon as the end date is reached, i.e. the end date is not included in the run. For complete ED runs, I'm pretty sure you'll need to use the actual end date + 1. So you'll need your end date to be 2007/01/01.

It won't matter what met goes to 2007/01/01 since it never gets run. (Default will be to recycle the met)
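To illustrate the point about the end date, here is a minimal sketch that counts half-hourly steps in 2006 up to, but not including, the end time. It assumes a 1800 s output step and that the end time is exclusive, as described above; `steps_before` is a hypothetical helper, not an ED2 or PEcAn function.

```r
dt <- 1800  # assumed output step in seconds
year_start <- as.POSIXct("2006-01-01 00:00:00", tz = "UTC")

# Number of steps between the start of 2006 and the end time (exclusive)
steps_before <- function(end_str) {
  end <- as.POSIXct(end_str, tz = "UTC")
  as.numeric(difftime(end, year_start, units = "secs")) / dt
}

as_requested <- steps_before("2006-12-31 00:00:00")  # end date from ED2IN
plus_one_day <- steps_before("2007-01-01 00:00:00")  # actual end date + 1

# The difference is exactly one day of half-hourly values
missing_steps <- plus_one_day - as_requested
```

Under these assumptions `missing_steps` is the 24*2 = 48 values of 12/31/2006 that the original post reports as missing.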

crollinson avatar Jan 04 '18 17:01 crollinson

I tried giving different start/end dates, it complained that there is no 2007JAN.h5 but I guess this is solvable by changing a flag in ED2IN (so that it recycles the met, but doesn't run 2007/01/01)

However, the real issue is that the outputs are shifted according to the timezone: the first few data points of a 2005 run are written to a 2004 .h5 file, which pecan doesn't process, and the rest are shifted accordingly, so the last data points in 2006 are just zeros (in addition to the zeros from the last day not being run). Still, it would be nice if this could be handled on the write.configs end by shifting the start/end times accordingly.

istfer avatar Jan 04 '18 18:01 istfer

back to this. @mdietze what do you think?

So, there was an issue due to the hardcoded -21600.0d0 values in ED2's timestamps (also see issue #231)

Once we get rid of that, there remain some other issues to be resolved. My example here is a 2005-2006 Bartlett (lst-5) run, tower files.

  • simulation of 2005 starts from
2005           1           1        3000

ends at

2005          12          31      233000

As a result, 2005 has 17519 non-zero values and a padding zero at the end. This could be resolved by calling h5_output for OPTI at the beginning of the loop in ed_model, as is already done for INST. That way the 2005 simulation would start from 2005 1 1 0 and the zero would be at the beginning of the 2005 output instead of the end.

  • simulation of 2006 starts from
2006           1           1           0

ends at

2006          12          31           0

ED2 stops as soon as it sees the end time given in its ED2IN which is in this case:

   NL%IMONTHZ = 12
   NL%IDATEZ = 31
   NL%IYEARZ = 2006
   NL%ITIMEZ = 0

as written by pecan workflow. As a result 2006 has 17473 non-zero values with 47 padding zeros. This is not a big deal, but the runtime can be extended if we change NL%ITIMEZ to 2330 or something like that

  • the real issue is that, now we shift everything in met-process for lst-5 and add a buffer in front (also missing data points at the end), but we don't do any correction in the model2netcdf.ED2

which results in the following comparison with the data: plot_zoom_png

Once I manually add the padding zero at the end of the 2005 to its beginning and correct for the timezone, I get the right match: plot_zoom_png2

So, model2netcdf.ED2 needs to correct for lst as well. This also means, the way we process it now needs to change, i.e. we can't just loop over the years one by one, some of 2006-T- values belong to 2005-T- (they need to appear in 2005.nc not in 2006.nc from pecan's perspective)
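A rough sketch of what "not looping over the years one by one" could look like: concatenate the yearly series, shift by the site offset, then split again on local-time year boundaries. This assumes the values in each yearly file are on a UTC clock and ignores ED's padding zeros; `realign_years` and its arguments are hypothetical and not part of model2netcdf.ED2.

```r
# Hypothetical helper, not part of PEcAn or ED2.
# yearly: named list of per-year vectors, chronological, assumed on a UTC clock
# lst:    site offset from UTC in hours (e.g. -5 for Bartlett)
# dt:     output time step in seconds
realign_years <- function(yearly, lst, dt = 1800) {
  shift <- as.integer(round(-lst * 3600 / dt))  # steps UTC is ahead of local time
  v <- unlist(yearly, use.names = FALSE)
  n <- length(v)
  stopifnot(abs(shift) < n)
  if (shift >= 0) {
    # West of Greenwich: drop the leading steps, pad the end with NA
    local <- c(v[seq_len(n - shift) + shift], rep(NA, shift))
  } else {
    # East of Greenwich: pad the start with NA, drop the trailing steps
    local <- c(rep(NA, -shift), v[seq_len(n + shift)])
  }
  # Split back into years using the original per-year lengths
  year_of <- factor(rep(names(yearly), lengths(yearly)), levels = names(yearly))
  split(local, year_of)
}

# Toy example: two "years" of four steps each, site one hour behind UTC
toy <- list("2005" = 1:4, "2006" = 5:8)
out <- realign_years(toy, lst = -1)
```

In the toy example the first two values of the "2006" vector move into "2005", which is the behaviour described above. The steps with no source data are left as NA here rather than zero, so they cannot be mistaken for model output.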

The other alternative is to change ED2 code to assume UTC0 everywhere (such that no shift is needed in pre/post process), but @mdietze found this to be more of a hack rather than a fix

istfer avatar Jun 05 '18 13:06 istfer

After today's meeting, adjusting for timezone in model2netcdf.ED2 might not be the way to go. Rather, load_data will bring everything to UTC? Just tagging @bcow here.

istfer avatar Jun 06 '18 15:06 istfer

I think the met2model.ED2 code needs a fix, but checking here first if it makes sense. @istfer @crollinson

Backstory: New visiting student Xiaowu is trying to run her site with ED2. She has created a valid NC file. However, met2model.ED2 fails on a bad “toff” value.

I looked at the files, which look good. However, I think with a positive LST offset (8, for China), then the “toff” variable in met2model.ED2 becomes negative, which breaks the code for padding time in front of the data. This would only work if LST is negative.

met2model.ED2 has these lines (from line 145)

toff <- -as.numeric(lst) * 3600 / dt

## buffer to get to GMT
slen <- seq_along(SW)
Tair <- c(rep(Tair[1], toff), Tair)[slen]

A negative toff is not cool!
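A minimal reproduction of the failure, using only base R. The values `lst = 8` and `dt = 1800` are assumptions for illustration (the actual time step of the file is not stated here), and the `1` stands in for `Tair[1]`.

```r
lst <- 8      # assumed offset for the site in China
dt  <- 1800   # assumed met time step in seconds

toff <- -as.numeric(lst) * 3600 / dt   # negative for positive lst

# rep() rejects a negative 'times' argument, so the padding step errors out
res <- tryCatch(
  rep(1, toff),
  error = function(e) conditionMessage(e)
)
```

With these values `toff` is -16 and `res` holds the error message from `rep()` instead of a padded vector.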

ankurdesai avatar Nov 03 '18 04:11 ankurdesai

I think that's a pretty old bit of code. Looks like no one has used ED through pecan in the Eastern Hemisphere?

My understanding, last time I checked, was that in pecan we decided to have things in UTC before met2model and after model2netcdf.

But I don't think it's in effect currently (i.e. data are not in UTC), and afaik most of the models do assume local time. So we still need this conversion in met2model.ED2 (although I still find it quirky that ED runs things in UTC time).

Overall, a negative toff looks like a bug to me, I guess we need some fix like:

if(toff >= 0){
  # toff == 0 must take this branch: -1:0 below would drop the first element
  Tair <- c(rep(Tair[1], toff), Tair)[slen]
}else{
  # Tair <- Tair[(-1:toff)]
  # or if we want the full length 
  Tair <- c(Tair[(-1:toff)], rep(Tair[length(Tair)], -toff))[slen]
}
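The same idea wrapped in a small function so both hemispheres can be tried on a toy vector. This is only a sketch: `shift_to_utc` is a hypothetical name, and rounding `toff` to a whole number of steps is an assumption that met2model.ED2 does not make.

```r
# Hypothetical helper, not the met2model.ED2 implementation.
# x:   met variable in local time
# lst: site offset from UTC in hours
# dt:  time step in seconds
shift_to_utc <- function(x, lst, dt) {
  toff <- as.integer(round(-as.numeric(lst) * 3600 / dt))
  n <- length(x)
  stopifnot(abs(toff) < n)
  if (toff >= 0) {
    # Western hemisphere (or UTC): pad the front, truncate to original length
    c(rep(x[1], toff), x)[seq_len(n)]
  } else {
    # Eastern hemisphere: drop the first -toff steps, pad the end
    k <- -toff
    c(x[-seq_len(k)], rep(x[n], k))
  }
}

west <- shift_to_utc(1:6, lst = -1, dt = 1800)  # 1 1 1 2 3 4
east <- shift_to_utc(1:6, lst =  1, dt = 1800)  # 3 4 5 6 6 6
utc  <- shift_to_utc(1:6, lst =  0, dt = 1800)  # unchanged
```

Either way the output keeps the original length, so data points are lost at one end and repeated at the other.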

istfer avatar Nov 03 '18 17:11 istfer

This issue is stale because it has been open 365 days with no activity.

github-actions[bot] avatar Apr 16 '20 00:04 github-actions[bot]