time operations where time_bounds span multiple averaging periods
The functions in climatology.py assume that each sample's time_bound fits entirely within the averaging period being applied; this assumption is violated when computing, say, monthly averages from 5-day data. A more appropriate approach would be to compute averaging weights based on the portion of each time_bound interval that falls within the target averaging period.
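A minimal sketch of the weight-based idea, in plain NumPy rather than esmlab or xarray. The helper names (overlap_weights, weighted_period_mean) are hypothetical, and time is assumed to be expressed as plain numbers (e.g. days since a reference date) with time_bounds given as (n, 2) start/end pairs:

```python
import numpy as np


def overlap_weights(time_bounds, period_start, period_end):
    """Weights proportional to how much of each sample interval lies in the period.

    time_bounds: array-like of shape (n, 2) holding the start and end of each
    sample, in the same numeric units as period_start and period_end.
    (Hypothetical helper, not part of esmlab.)
    """
    tb = np.asarray(time_bounds, dtype=float)
    # Length of the intersection between each sample interval and the period.
    overlap = np.minimum(tb[:, 1], period_end) - np.maximum(tb[:, 0], period_start)
    # Samples entirely outside the period get a negative value; clip to zero.
    overlap = np.clip(overlap, 0.0, None)
    total = overlap.sum()
    if total == 0:
        raise ValueError("no sample overlaps the averaging period")
    return overlap / total


def weighted_period_mean(values, time_bounds, period_start, period_end):
    """Average of values over the period, weighted by interval overlap."""
    w = overlap_weights(time_bounds, period_start, period_end)
    return float(np.sum(w * np.asarray(values, dtype=float)))


# 5-day samples covering days [0, 35); a 31-day "January" is days [0, 31).
edges = np.arange(0, 40, 5)
bounds = np.column_stack([edges[:-1], edges[1:]])
values = np.arange(1.0, 8.0)  # one value per 5-day sample

jan_weights = overlap_weights(bounds, 0, 31)
jan_mean = weighted_period_mean(values, bounds, 0, 31)
```

The last 5-day sample spans days 30 to 35, so only one of its five days counts toward January, and its weight is reduced accordingly.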
One solution for this issue is to interpolate from, say, 5-day data to 1-day data (using 'zero', i.e., piecewise-constant interpolation), and then to compute monthly averages on the daily data. This would be less efficient than an approach based on computing weights, but would be more general and easier to implement. The problem, however, is that xarray does not support interpolation over a chunked dimension.
When I try to interpolate a dataset that's read in using open_mfdataset, I get the following:
>>> da.interp(time=new_time_dim).compute()
...
NotImplementedError: Chunking along the dimension to be interpolated (2) is not yet supported.
Eliminating the chunking over the time dimension solves this issue, but that is not a feasible option in practice.
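For reference, here is what the interpolate-then-average route amounts to on a small in-memory array, sketched in plain NumPy (so it sidesteps, rather than solves, the chunking limitation). It assumes integer-day time bounds, and expand_to_daily is a hypothetical helper:

```python
import numpy as np


def expand_to_daily(values, time_bounds):
    """Piecewise-constant expansion of interval data onto a daily axis.

    time_bounds: array-like of shape (n, 2) with integer start/end days;
    each value is repeated once per day of its interval.
    (Hypothetical helper, not part of esmlab or xarray.)
    """
    tb = np.asarray(time_bounds, dtype=int)
    lengths = tb[:, 1] - tb[:, 0]
    # Day index covered by each repeated value.
    days = np.concatenate([np.arange(start, end) for start, end in tb])
    daily = np.repeat(np.asarray(values, dtype=float), lengths)
    return days, daily


# 5-day samples covering days [0, 35); a 31-day "January" is days [0, 31).
edges = np.arange(0, 40, 5)
bounds = np.column_stack([edges[:-1], edges[1:]])
values = np.arange(1.0, 8.0)

days, daily = expand_to_daily(values, bounds)
in_january = (days >= 0) & (days < 31)
jan_mean_daily = daily[in_january].mean()
```

On this toy input the plain mean of the daily values inside the month equals the mean obtained by weighting each 5-day sample by its overlap with the month.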
@alperaltuntas, what if we "unfix time" and use the resample on the float time-axis, then "refix time" to compute the monthly climatology?
I'll try this.
On second thought, I think resample only works on time axes.
Can't we convert the time axis from cftime to Pandas' accepted time type, instead of unfixing the time?
I think pandas is too restrictive for our data:
When decoding/encoding datetimes for non-standard calendars or for dates before year 1678 or after
year 2262, xarray uses the cftime library. It was previously packaged with the netcdf4-python package
under the name netcdftime but is now distributed separately. cftime is an optional dependency of
xarray.
Have you installed cftime?
-- Kevin Paul, PhD Project Scientist, Head of I/O & Workflow Applications (IOWA)
The National Center for Atmospheric Research Computational and Information Systems Laboratory 1850 Table Mesa Dr Boulder, CO 80305
Phone: (303) 497-2441 Office: ML460B
Yes. The issue is that cftime doesn’t work with resample and pandas time is too restrictive.
Ah. Got it.
Actually, this issue still applies to compute_mon_climatology. I am not sure whether it applies to other functions in climatology.py. I am planning to update compute_mon_climatology based on the new function that computes means (compute_mon_mean).
Relatedly, I am wondering what's the best way of distinguishing functions that compute climatologies from functions that compute means. I added compute_mon_mean (which computes monthly means, not a climatology) to the climatology module, but I am not sure whether placing it there will cause confusion. Another potential source of confusion is that the function that computes the annual climatology is named compute_ann_mean. Should it be named compute_ann_climatology?
@matt-long ?
Relatedly, I am wondering what's the best way of distinguishing functions that compute climatology vs functions that compute means.
@matt-long, I presume @alperaltuntas's concern would be solved by the nomenclature suggestion you made in our conversation today.
Should we imitate NCL's nomenclature to some extent: https://www.ncl.ucar.edu/Document/Functions/climo.shtml?
@alperaltuntas,
I added compute_mon_mean (which computes monthly means, not climatology) to climatology module, but not sure if placing it to climatology module will cause confusion.
In #109, I am removing the climatology.py module, and most of the utility functions in utils will be moved to an EsmlabAccessor class in a new module, core.py.
I am not sure that this completely solves the confusion issue. I've also moved most functions to the top level of esmlab, e.g. you can now call esmlab.compute_ann_mean() instead of esmlab.climatology.compute_ann_mean().