time operations where time_bounds span multiple averaging periods

Open matt-long opened this issue 6 years ago • 13 comments

The functions in climatology.py assume that each sample's time_bound fits entirely within the averaging period being applied; this assumption is violated when computing, say, monthly averages on 5-day data. A more appropriate approach would be to compute averaging weights based on the portion of each time_bound that falls within the target averaging period.

matt-long avatar Feb 09 '19 17:02 matt-long
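
A minimal sketch of the weighting idea described above, using only the Python standard library. The function names (`overlap_days`, `weighted_period_mean`) and the sample values are hypothetical and not part of esmlab. Each sample is weighted by the number of days of its time bound that fall inside the target averaging period. Real model output would typically use cftime dates on a non-standard calendar rather than `datetime`.

```python
from datetime import datetime

def overlap_days(bounds, period):
    """Length in days of the overlap between two (start, end) intervals."""
    start = max(bounds[0], period[0])
    end = min(bounds[1], period[1])
    # A negative difference means the intervals do not overlap at all.
    return max((end - start).total_seconds(), 0.0) / 86400.0

def weighted_period_mean(values, time_bounds, period):
    """Average values, weighting each one by its overlap with the period."""
    weights = [overlap_days(b, period) for b in time_bounds]
    total = sum(weights)
    if total == 0:
        raise ValueError("no sample overlaps the averaging period")
    return sum(w * v for w, v in zip(weights, values)) / total

# Two 5-day means; the second one straddles the January/February boundary.
time_bounds = [
    (datetime(2000, 1, 25), datetime(2000, 1, 30)),
    (datetime(2000, 1, 30), datetime(2000, 2, 4)),
]
values = [10.0, 20.0]
january = (datetime(2000, 1, 1), datetime(2000, 2, 1))

jan_weights = [overlap_days(b, january) for b in time_bounds]
jan_mean = weighted_period_mean(values, time_bounds, january)
```

Here the second sample contributes only the two days that fall in January, so it no longer counts as a full 5-day sample in the January average.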

One solution for this issue is to interpolate from, say, 5-day data to 1-day data (using 'zero', i.e., piecewise-constant interpolation), and then to compute monthly averages on the daily data. This would be less efficient than an approach based on computing weights, but it would be more general and easier to implement. The problem, however, is that xarray does not support interpolation over a chunked dimension.

When I try to interpolate a dataset that's read in using open_mfdataset, I get the following:

>>> da.interp(time=new_time_dim).compute()
...
NotImplementedError: Chunking along the dimension to be interpolated (2) is not yet supported.

Eliminating the chunking over the time dimension solves this issue, but that is not a feasible option for practical use.

alperaltuntas avatar Feb 14 '19 02:02 alperaltuntas
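
A minimal sketch of the interpolate-then-average idea, written with the Python standard library rather than xarray, so it does not address the chunking limitation. The helper names (`expand_to_daily`, `monthly_means`) and the sample values are hypothetical. Piecewise-constant expansion simply repeats each 5-day value for every day inside its time bound; monthly means are then taken over the daily values.

```python
from collections import defaultdict
from datetime import date, timedelta

def expand_to_daily(values, time_bounds):
    """Repeat each value for every day in its [start, end) time bound."""
    daily = {}
    for value, (start, end) in zip(values, time_bounds):
        day = start
        while day < end:
            daily[day] = value
            day += timedelta(days=1)
    return daily

def monthly_means(daily):
    """Average the daily values grouped by (year, month)."""
    groups = defaultdict(list)
    for day, value in daily.items():
        groups[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

# Two 5-day means; the second one straddles the January/February boundary.
time_bounds = [
    (date(2000, 1, 25), date(2000, 1, 30)),
    (date(2000, 1, 30), date(2000, 2, 4)),
]
daily = expand_to_daily([10.0, 20.0], time_bounds)
means = monthly_means(daily)
```

With day-aligned, contiguous bounds, this gives the same monthly values as weighting each sample by the number of its days inside the month, at the cost of materializing the daily series.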

@alperaltuntas, what if we "unfix time" and use the resample on the float time-axis, then "refix time" to compute the monthly climatology?

matt-long avatar Feb 14 '19 23:02 matt-long

@alperaltuntas, what if we "unfix time" and use the resample on the float time-axis, then "refix time" to compute the monthly climatology?

I'll try this.

alperaltuntas avatar Feb 14 '19 23:02 alperaltuntas

On second thought, I think resample only works on time axes.

matt-long avatar Feb 14 '19 23:02 matt-long

Can't we convert the time axis from cftime to Pandas' accepted time type, instead of unfixing the time?

alperaltuntas avatar Feb 15 '19 00:02 alperaltuntas

I think pandas is too restrictive for our data:

When decoding/encoding datetimes for non-standard calendars or for dates before year 1678 or after 
year 2262, xarray uses the cftime library. It was previously packaged with the netcdf4-python package 
under the name netcdftime but is now distributed separately. cftime is an optional dependency of 
xarray.

matt-long avatar Feb 15 '19 14:02 matt-long
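
As a rough check on where the 1678 and 2262 limits in the quoted passage come from: nanosecond-resolution timestamps stored as signed 64-bit integers counted from 1970 can only span about 292 years in each direction. The sketch below is plain arithmetic and does not call pandas; the use of the mean Gregorian year length is an approximation chosen for illustration.

```python
# Signed 64-bit nanosecond counts centred on the 1970 epoch.
NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year (approximation)

max_ns = 2**63 - 1
span_years = max_ns / NS_PER_SECOND / SECONDS_PER_YEAR

earliest_year = 1970 - span_years
latest_year = 1970 + span_years
```

The resulting range starts late in 1677 and ends early in 2262, which is why long paleoclimate or future-scenario time axes, as well as non-standard calendars, need cftime instead.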

Have you installed cftime?

kmpaul avatar Feb 15 '19 15:02 kmpaul

Yes. The issue is that cftime doesn’t work with resample and pandas time is too restrictive.

matt-long avatar Feb 15 '19 15:02 matt-long

Ah. Got it.

kmpaul avatar Feb 15 '19 15:02 kmpaul

Actually, this issue still applies to compute_mon_climatology. I am not sure whether it applies to other functions in climatology.py. I am planning to update compute_mon_climatology based on the new function that computes means (compute_mon_mean).

Relatedly, I am wondering what's the best way of distinguishing functions that compute climatology vs functions that compute means. I added compute_mon_mean (which computes monthly means, not climatology) to the climatology module, but I am not sure if placing it in the climatology module will cause confusion. Another potential source of confusion is that the function that computes annual climatology is named compute_ann_mean. Should it be named compute_ann_climatology?

@matt-long ?

alperaltuntas avatar Apr 03 '19 20:04 alperaltuntas
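
To make the distinction concrete, here is a minimal, unweighted sketch in plain Python. The functions `monthly_mean` and `monthly_climatology` are hypothetical illustrations and are not esmlab's compute_mon_mean or compute_mon_climatology. A monthly mean keeps one value per (year, month), whereas a monthly climatology collapses all years into one value per calendar month.

```python
from collections import defaultdict

def monthly_mean(samples):
    """One value per (year, month)."""
    groups = defaultdict(list)
    for (year, month), value in samples:
        groups[(year, month)].append(value)
    return {k: sum(v) / len(v) for k, v in groups.items()}

def monthly_climatology(samples):
    """One value per calendar month, averaged across all years."""
    groups = defaultdict(list)
    for (year, month), value in samples:
        groups[month].append(value)
    return {k: sum(v) / len(v) for k, v in groups.items()}

# Each sample is ((year, month), value); values are made up for illustration.
samples = [
    ((2000, 1), 1.0), ((2000, 1), 3.0),
    ((2001, 1), 5.0), ((2001, 1), 7.0),
    ((2000, 2), 2.0), ((2001, 2), 4.0),
]
means = monthly_mean(samples)
climatology = monthly_climatology(samples)
```

For two years of data, `means` has four entries (one per year and month), while `climatology` has only two (January and February).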

Relatedly, I am wondering what's the best way of distinguishing functions that compute climatology vs functions that compute means.

@matt-long, I presume @alperaltuntas's concern would be solved by the nomenclature suggestion you made in our conversation today.

andersy005 avatar Apr 03 '19 23:04 andersy005

Should we imitate NCL's nomenclature to some extent? See https://www.ncl.ucar.edu/Document/Functions/climo.shtml

andersy005 avatar Apr 03 '19 23:04 andersy005

@alperaltuntas,

I added compute_mon_mean (which computes monthly means, not climatology) to the climatology module, but I am not sure if placing it in the climatology module will cause confusion.

In #109, I am removing the climatology.py module, and most of the utility functions in utils will be moved to an EsmlabAccessor class in a new module, core.py.

I'm not sure that it completely solves the confusion issue. I've also moved most functions to the top level of esmlab, e.g., you can now call esmlab.compute_ann_mean() instead of esmlab.climatology.compute_ann_mean().

andersy005 avatar Apr 04 '19 04:04 andersy005