
Add temporal bounds and center times for `group_average()` API

Open tomvothecoder opened this issue 1 year ago • 3 comments

Description

  • Closes #565

Checklist

  • [ ] My code follows the style guidelines of this project
  • [ ] I have performed a self-review of my own code
  • [ ] My changes generate no new warnings
  • [ ] Any dependent changes have been merged and published in downstream modules

If applicable:

  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] New and existing unit tests pass with my changes (locally and CI/CD build)
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

tomvothecoder avatar Nov 22 '24 22:11 tomvothecoder

@pochedls and @oliviermarti this PR should address this GH issue (same as this comment from @oliviermarti).

If you can check this branch out and try it that'd be great.

import numpy as np
import pandas as pd
import xarray as xr
import xcdat as xc

# Create a dummy xarray dataset
time = pd.date_range("2000-01-01", "2001-12-31", freq="D")
data = np.random.rand(len(time))
dummy_ds = xr.Dataset({"dummy_var": (["time"], data)}, coords={"time": time})
dummy_ds["time"].encoding["calendar"] = "standard"
dummy_ds = dummy_ds.bounds.add_missing_bounds(axes=["T"])

ds_avg = dummy_ds.temporal.group_average("dummy_var", freq="month")

Before -- no time_bnds and time starts at the beginning of the averaged period

ds_avg.time

<xarray.DataArray 'time' (time: 24)> Size: 192B
array([cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 2, 1, 0, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 3, 1, 0, 0, 0, 0, has_year_zero=False),
		...
      dtype=object)
Coordinates:
  * time     (time) object 192B 2000-01-01 00:00:00 ... 2001-12-01 00:00:00
Attributes:
    bounds:   time_bnds

Result -- time is now centered using time_bnds

ds_avg.time

array([cftime.DatetimeGregorian(2000, 1, 16, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 2, 15, 12, 0, 0, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 3, 16, 12, 0, 0, 0, has_year_zero=False),
		...
      dtype=object)
ds_avg.time_bnds

array([[cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 0, 0, has_year_zero=False),
        cftime.DatetimeGregorian(2000, 2, 1, 0, 0, 0, 0, has_year_zero=False)],
       [cftime.DatetimeGregorian(2000, 2, 1, 0, 0, 0, 0, has_year_zero=False),
        cftime.DatetimeGregorian(2000, 3, 1, 0, 0, 0, 0, has_year_zero=False)],
       [cftime.DatetimeGregorian(2000, 3, 1, 0, 0, 0, 0, has_year_zero=False),
        cftime.DatetimeGregorian(2000, 4, 1, 0, 0, 0, 0, has_year_zero=False)],
		...
      dtype=object)
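
The centered values above are consistent with taking the midpoint of each pair of bounds. Here is a minimal sketch of that arithmetic, using plain `datetime` objects instead of cftime and assuming the center is derived as a simple midpoint. The helper `center_time` is hypothetical and not part of xcdat.

```python
from datetime import datetime


def center_time(lower: datetime, upper: datetime) -> datetime:
    """Return the midpoint of a (lower, upper) pair of time bounds."""
    return lower + (upper - lower) / 2


# Monthly bounds matching the first three rows of the time_bnds output.
time_bnds = [
    (datetime(2000, 1, 1), datetime(2000, 2, 1)),
    (datetime(2000, 2, 1), datetime(2000, 3, 1)),
    (datetime(2000, 3, 1), datetime(2000, 4, 1)),
]

# One centered time per pair of bounds.
centers = [center_time(lower, upper) for lower, upper in time_bnds]
```

For January (31 days) the midpoint falls 15.5 days after the lower bound, which gives 2000-01-16 12:00 and matches the first centered value in the output.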

tomvothecoder avatar Nov 22 '24 22:11 tomvothecoder

@tomvothecoder – this is great – thanks for pushing this forward so quickly.

I think add_missing_bounds will work in most cases, but will fail for seasonal averages (and definitely custom seasons).

I think we'll need to collect the bounds for each group (e.g., group_bounds_array = [("2000-01-01 00:00", "2000-01-02 00:00"), ("2000-01-02 00:00", "2000-01-03 00:00"), ..., ("2000-01-31 00:00", "2000-02-01 00:00")]) and then take the min of the lower bound and the max of the upper bound (i.e., group_bnd = [np.min(group_bounds_array[:, 0]), np.max(group_bounds_array[:, 1])]).
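
A rough sketch of the min/max idea in plain Python, not the xcdat implementation. The function `group_bounds` and its `key` argument are hypothetical names used for illustration, and the grouping key here is simply (year, month).

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_bounds(bounds, key):
    """Collapse per-timestep (lower, upper) bounds into one pair per group.

    `key` maps a lower bound to its group label, e.g. (year, month).
    """
    grouped = defaultdict(list)
    for lower, upper in bounds:
        grouped[key(lower)].append((lower, upper))

    # For each group: min of the lower bounds, max of the upper bounds.
    return {
        label: (min(lo for lo, _ in pairs), max(hi for _, hi in pairs))
        for label, pairs in grouped.items()
    }


# Daily bounds covering January and February 2000 (31 + 29 = 60 days).
start = datetime(2000, 1, 1)
daily_bounds = [
    (start + timedelta(days=i), start + timedelta(days=i + 1)) for i in range(60)
]

monthly = group_bounds(daily_bounds, key=lambda t: (t.year, t.month))
```

Because the result depends only on which time steps land in each group, a different `key` (for example one that returns a season label) would presumably cover seasonal and custom-season groupings without changing the min/max step.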

pochedls avatar Nov 22 '24 22:11 pochedls

I think we'll need to collect the bounds for each group (e.g., group_bounds_array = [("2000-01-01 00:00", "2000-01-02 00:00"), ("2000-01-02 00:00", "2000-01-03 00:00"), ..., ("2000-01-31 00:00", "2000-02-01 00:00")]) and then take the min of the lower bound and the max of the upper bound (i.e., group_bnd = [np.min(group_bounds_array[:, 0]), np.max(group_bounds_array[:, 1])]).

This makes sense to me. I'll think of an algorithm.

tomvothecoder avatar Dec 06 '24 23:12 tomvothecoder