[Bug]: temporal.group_average with custom seasons produces wrong results when a season crosses the year boundary
What happened?
For a custom season such as ["Dec", "Jan", "Feb"], xcdat uses the December value from the same calendar year instead of the December value from the previous year, as it should.
What did you expect to happen? Are there any possible answers you came across?
No response
Minimal Complete Verifiable Example (MVCE)
import numpy as np, xarray as xr, cftime, xcdat as xc
# Creates a monthly time axis
nyear = 3
time = []
for ny in np.arange (nyear) :
    for nm in np.arange (1,13) :
        time.append ( cftime.datetime (year=1900+ny , month=nm , day=15, hour=0, minute=0, second=0, calendar='gregorian', has_year_zero=False) )
time = cftime.date2num (time, units="seconds since 1900-01-01 00:00:00.000000", calendar='gregorian', has_year_zero=False, longdouble=False)
time = xr.DataArray ( time, dims=('time',), coords=(time,) )
time.attrs.update ( {
'axis' : "T",
'standard_name': "time",
'long_name' : "Time axis",
'time_origin' : "1900-01-01 00:00:00",
'units' : "seconds since 1900-01-01 00:00:00.000000",
'calendar' : "gregorian" })
# Creates a simple variable
#Var = (np.arange (len(time))%12 + 1).astype(float)
Var = (np.arange (len(time)) + 1).astype(float)
Var = xr.DataArray (Var, dims=('time',), coords=(time,) )
dd = xr.Dataset ( {'Var':Var})# 'time_bnds':time_bnds} )
dd.to_netcdf ( 'toto.nc', mode="a" )
dc = xc.open_dataset ('toto.nc', use_cftime=True, decode_times=True).bounds.add_missing_bounds()
# 'Classical' three-month seasonal values are correct
result_1 = dc.temporal.group_average ( "Var", "season", season_config={"dec_mode": "DJF", "drop_incomplete_djf":False }).Var
# Using custom seasons: values for seasons crossing the year boundary are wrong
custom_seasons = [["Dec", "Jan", "Feb"], ["Mar", "Apr", "May"], ["Jun", "Jul", "Aug"], ["Sep", "Oct", "Nov"]]
#custom_seasons = [["Dec", "Jan", "Feb", "Mar"], ["Apr", "May", "Jun", "Jul"], ["Aug", "Sep", "Oct", "Nov"]]
#custom_seasons = [["Jun", "Jul", "Aug", "Sep"], ["Oct", "Nov", "Dec", "Jan"], ["Feb", "Mar", "Apr", "May"]]
result_2 = dc.temporal.group_average ( "Var", "season", season_config={"custom_seasons":custom_seasons, "drop_incomplete_djf":False} ).Var
print ( result_1.values )
print ( result_2.values )
Relevant log output
Both computations should give the same result, but we get:
[ 1.47457627 4. 7.01086957 10. 12.96666667 16.
19.01086957 22. 24.96666667 28. 31.01086957 34.
36. ]
[ 5.1 4. 7.01086957 10. 17.1 16.
19.01086957 22. 29.1 28. 31.01086957 34. ]
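The difference can be checked by hand. The sketch below assumes the averages are weighted by the number of days in each month (which is consistent with the printed values) and that Var equals 1 plus the number of months elapsed since January 1900, as in the example above. The helper names `days_in_month` and `weighted_mean` are illustrative only and are not part of xcdat.

```python
import calendar

def days_in_month(year, month):
    # Number of days in the given month (1900 is not a leap year).
    return calendar.monthrange(year, month)[1]

def weighted_mean(months):
    """Day-weighted mean of Var over a list of (year, month) tuples.

    Var is assumed to be 1 for Jan 1900, 2 for Feb 1900, and so on.
    """
    values = [(year - 1900) * 12 + month for year, month in months]
    weights = [days_in_month(year, month) for year, month in months]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Grouping Jan, Feb and Dec of the SAME calendar year (observed custom-season behavior).
same_year = weighted_mean([(1900, 1), (1900, 2), (1900, 12)])

# Grouping Dec with the FOLLOWING Jan and Feb (expected DJF behavior).
cross_year = weighted_mean([(1900, 12), (1901, 1), (1901, 2)])

print(same_year)   # approximately 5.1, the first value of result_2
print(cross_year)  # approximately 12.9667, the second value of result_1
```

The value 5.1 in result_2 is therefore the mean of January, February and December 1900, whereas the expected DJF value 12.9667 combines December 1900 with January and February 1901.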
Anything else we need to know?
No response
Environment
xr.show_versions() /Users/marti/mambaforge/envs/FULL/lib/python3.11/site-packages/_distutils_hack/init.py:26: UserWarning: Setuptools is replacing distutils. warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit: None python: 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 20:51:20) [Clang 16.0.6 ] python-bits: 64 OS: Darwin OS-release: 23.4.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.14.3 libnetcdf: 4.9.2
xarray: 2024.3.0 pandas: 2.2.2 numpy: 1.26.4 scipy: 1.13.0 netCDF4: 1.6.5 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.6.3 nc_time_axis: 1.4.1 iris: 3.8.1 bottleneck: 1.3.8 dask: 2024.4.1 distributed: 2024.4.1 matplotlib: 3.8.3 cartopy: 0.22.0 seaborn: 0.13.2 numbagg: None fsspec: 2024.2.0 cupy: None pint: 0.23 sparse: 0.15.1 flox: None numpy_groupies: None setuptools: 69.1.1 pip: 24.0 conda: None pytest: 8.0.2 mypy: None IPython: 8.22.2 sphinx: 7.2.6
Hey @oliviermarti, thanks for opening this GitHub issue.
xCDAT currently does not support custom seasons that span the calendar year. Another user opened GitHub Issue #416, which I believe describes the same problem.
PR #423 is intended to expand the capabilities of custom seasons, including:
- Adding support for seasons that span calendar years
- Detecting and dropping incomplete seasons (not just DJF)
- Removing the requirement for all 12 months to be used for custom seasons
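As a rough illustration of the first point only (this is not the code in PR #423), one way to handle a season that spans calendar years is to label each month with the year in which its season ends, so that December is grouped with the following January and February. The function `season_year` and the `MONTHS` list below are hypothetical names used for this sketch, and the sketch assumes each season lists its months in chronological order.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def season_year(year, month, season):
    """Return the year label for a month that belongs to a custom season.

    `month` is a month number (1-12) and `season` is a list of month
    abbreviations in chronological order, e.g. ["Dec", "Jan", "Feb"].
    Months listed before the year wrap are assigned to the following
    year, so the whole season shares one label.
    """
    numbers = [MONTHS.index(name) + 1 for name in season]
    # The wrap is the first position where the month number decreases.
    wrap = next(
        (i for i in range(1, len(numbers)) if numbers[i] < numbers[i - 1]),
        None,
    )
    if wrap is None:
        return year  # season does not cross a year boundary
    return year + 1 if month in numbers[:wrap] else year

djf = ["Dec", "Jan", "Feb"]
print(season_year(1900, 12, djf))  # Dec 1900 is labeled with the next year
print(season_year(1901, 1, djf))   # Jan 1901 keeps its own year
```

Grouping by the pair (season, season_year) instead of (season, calendar year) would then put December 1900 together with January and February 1901, and a season whose group has fewer months than its definition could be flagged as incomplete.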