
Selecting a time instance from a data array is complicated

Open • navidcy opened this issue 5 years ago • 1 comment

Changing the xlim of a time series plotted using cc is not trivial at all. Have a look at this: https://gist.github.com/navidcy/5375e98d4ac93185cb32c8edb76db60b

Could we implement something in cc to simplify the way one can slice a time-series or pick a particular instance in time?

navidcy, Nov 11 '19 05:11

I think there are a few unintuitive things going on. If we don't restrict the loaded data with the start_time= and end_time= keywords, we get the usual warning:

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-19.10/lib/python3.6/
    site-packages/xarray/coding/times.py:459:
SerializationWarning: Unable to decode time axis into full numpy.datetime64 objects,
continuing using cftime.datetime objects instead, reason: dates out of range

So we get a time DataArray that looks like

<xarray.DataArray 'time' (time: 8580)>
array([cftime.DatetimeNoLeap(1900, 1, 16, 12, 0, 0, 0, 4, 16),
       cftime.DatetimeNoLeap(1900, 2, 15, 0, 0, 0, 0, 6, 46),
       cftime.DatetimeNoLeap(1900, 3, 16, 12, 0, 0, 0, 0, 75), ...

I think it makes more sense to select the relevant time range first, rather than adjusting xlim afterwards. We can select all values from a single year with .sel(time='2300'), a single month with .sel(time='2300-01'), or a single day with .sel(time='2300-01-16'). Passing method='nearest' doesn't seem to work, though: .sel(time='2300-01-01', method='nearest') returns an empty result.

In the case of this experiment, we actually do have a non-monotonic time index for some reason, jumping from 2009 back to 2005:

In [92]: temp.time.isel(time=slice(1315, 1325))
Out[92]: 
<xarray.DataArray 'time' (time: 10)>
array([cftime.DatetimeNoLeap(2009, 8, 16, 12, 0, 0, 0, 3, 228),
       cftime.DatetimeNoLeap(2009, 9, 16, 0, 0, 0, 0, 6, 259),
       cftime.DatetimeNoLeap(2009, 10, 16, 12, 0, 0, 0, 1, 289),
       cftime.DatetimeNoLeap(2009, 11, 16, 0, 0, 0, 0, 4, 320),
       cftime.DatetimeNoLeap(2009, 12, 16, 12, 0, 0, 0, 6, 350),
       cftime.DatetimeNoLeap(2005, 1, 16, 12, 0, 0, 0, 4, 16),
       cftime.DatetimeNoLeap(2005, 2, 15, 0, 0, 0, 0, 6, 46),
       cftime.DatetimeNoLeap(2005, 3, 16, 12, 0, 0, 0, 0, 75),
       cftime.DatetimeNoLeap(2005, 4, 16, 0, 0, 0, 0, 3, 106),
       cftime.DatetimeNoLeap(2005, 5, 16, 12, 0, 0, 0, 5, 136)], dtype=object)
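A jump like this can be located without scanning the array by eye. The sketch below is one possible approach, not something cc provides: find_time_reversals is a hypothetical helper name, and the stdlib datetime values are stand-ins that mirror the dates above. On the real data one could presumably pass temp.time.values instead, since cftime objects from the same calendar support the same < comparison.

```python
from datetime import datetime


def find_time_reversals(times):
    """Return every index i where times[i] is earlier than times[i - 1]."""
    return [i for i in range(1, len(times)) if times[i] < times[i - 1]]


# Stand-in values mirroring the jump from December 2009 back to January 2005.
times = [
    datetime(2009, 11, 16, 0),
    datetime(2009, 12, 16, 12),
    datetime(2005, 1, 16, 12),
    datetime(2005, 2, 15, 0),
]

breaks = find_time_reversals(times)
print(breaks)  # positions where the time axis steps backwards
```

An empty list means the axis never steps backwards, so label-based slicing with .sel(time=slice(...)) should behave as expected.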

If we skip this anomaly, then we can do a time range selection as expected:

In [93]: temp.isel(time=slice(1400, None)).sel(time=slice('2300', '2400'))
Out[93]:
<xarray.DataArray 'surface_temp' (time: 1212, yt_ocean: 300, xt_ocean: 360)>
dask.array<getitem, shape=(1212, 300, 360), dtype=float32, chunksize=(1, 300, 360), chunktype=numpy.ndarray>
Coordinates:
  * xt_ocean  (xt_ocean) float64 -279.5 -278.5 -277.5 -276.5 ... 77.5 78.5 79.5
  * yt_ocean  (yt_ocean) float64 -77.88 -77.63 -77.38 ... 88.87 89.32 89.77
  * time      (time) object 2300-01-16 12:00:00 ... 2400-12-16 12:00:00

I think this feels cleaner than trying to change the xlim after the fact (it gets a bit confusing with the different datetime types, etc.). The more worrying part may be that the query on this experiment returns non-monotonic time data...

angus-g, Nov 11 '19 23:11