
[Bug]: Writing out decoded time coordinates with non-CF time units breaks

Open tomvothecoder opened this issue 1 year ago • 0 comments

What happened?

When xarray writes a dataset to a file, it encodes time coordinates with cftime.date2num(), which restricts which combinations of units and calendar are allowed.

units: a string of the form "<time units> since <reference time>"

Example:

Let's say we open up a dataset with the following metadata for time coordinates:

  • units = "months since 1800"
  • calendar = "standard"

xcdat is able to decode these time coordinates, but xarray cannot write the dataset back to a file.
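For context on why these units are awkward: outside of a 360_day calendar a month is not a fixed number of days, so "N months since ..." can only be resolved by stepping through calendar months. A minimal sketch of that arithmetic using only the standard library is below. This is not xcdat's actual implementation, and `add_months` is a hypothetical name used for illustration.

```python
from datetime import date


def add_months(reference: date, months: int) -> date:
    """Hypothetical helper: shift `reference` by a whole number of calendar months."""
    # Count months from year 0 so that the carry into the year is a simple divmod.
    total = reference.year * 12 + (reference.month - 1) + months
    year, month_index = divmod(total, 12)
    # Keeps the day of month; raises ValueError if that day does not exist
    # in the target month (e.g. Jan 31 shifted by one month).
    return reference.replace(year=year, month=month_index + 1)


# An offset of 13 from "months since 2000-01-01" lands on 2001-02-01.
print(add_months(date(2000, 1, 1), 13))
```

The step from one value to the next is 28 to 31 days depending on the month, which is presumably why cftime refuses to encode with these units for a "standard" calendar.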

Writing out time coordinates with "months since ..."

Result:

File ~/miniconda3/envs/xcdat_dev/lib/python3.10/site-packages/xarray/coding/times.py:684, in CFDatetimeCoder.encode(self, variable, name)
...
File src/cftime/_cftime.pyx:245, in cftime._cftime.date2num()

File src/cftime/_cftime.pyx:98, in cftime._cftime._dateparse()

ValueError: 'months since' units only allowed for '360_day' calendar

Writing out time coordinates with "years since ..."

Result:

File ~/miniconda3/envs/xcdat_dev/lib/python3.10/site-packages/xarray/coding/times.py:684, in CFDatetimeCoder.encode(self, variable, name)
...
File src/cftime/_cftime.pyx:245, in cftime._cftime.date2num()

File src/cftime/_cftime.pyx:102, in cftime._cftime._dateparse()

ValueError: In general, units must be one of 'microseconds', 'milliseconds', 'seconds', 'minutes', 'hours', or 'days' (or select abbreviated versions of these).  For the '360_day' calendar, 'months' can also be used, or for the 'noleap' calendar 'common_years' can also be used. Got 'years' instead, which are not recognized.
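The rule stated in that error message can be sketched as a small pre-write check. This is only an approximation based on the message text: it ignores the "abbreviated versions" the message mentions and does not handle calendar aliases. `can_encode`, `BASE_UNITS`, and `CALENDAR_SPECIFIC_UNITS` are hypothetical names, not part of cftime, xarray, or xcdat.

```python
# Units accepted for every calendar, per the error message quoted above.
BASE_UNITS = {"microseconds", "milliseconds", "seconds", "minutes", "hours", "days"}

# Extra units accepted only for one calendar, per the same message.
CALENDAR_SPECIFIC_UNITS = {"360_day": {"months"}, "noleap": {"common_years"}}


def can_encode(units: str, calendar: str) -> bool:
    """Hypothetical check: does `units` look encodable for `calendar`?"""
    parts = units.split()
    # Expect "<unit> since <reference>", e.g. "months since 1800".
    if len(parts) < 3 or parts[1].lower() != "since":
        return False
    allowed = BASE_UNITS | CALENDAR_SPECIFIC_UNITS.get(calendar, set())
    return parts[0].lower() in allowed


print(can_encode("months since 1800", "standard"))        # False
print(can_encode("years since 2000-01-01", "standard"))   # False
print(can_encode("days since 2000-01-01", "standard"))    # True
```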

What did you expect to happen?

Since xcdat supports decoding non-CF time coordinates, we should be able to write them back to a file.

Minimal Complete Verifiable Example

import xcdat as xc
from tests.fixtures import generate_dataset

# Case 1 -- "months since ..."
# ---------------------------
ds1 = generate_dataset(decode_times=True, cf_compliant=False, has_bounds=True)

print(ds1.time.encoding) 
# {'calendar': 'standard', 'units': 'months since 2000-01-01'}

ds1.to_netcdf("qa/issue-396-decode-times/case2-non-cf-time.nc")
# ValueError: 'months since' units only allowed for '360_day' calendar

# Case 2 -- "years since ..."
# ---------------------------
ds2 = generate_dataset(decode_times=True, cf_compliant=False, has_bounds=True)

ds2.time.encoding["units"] = "years since 2000-01-01"
print(ds2.time.encoding) 
# {'calendar': 'standard', 'units': 'years since 2000-01-01'}

ds2.to_netcdf("qa/issue-396-decode-times/case2-non-cf-time.nc")
# ValueError: In general, units must be one of 'microseconds', 'milliseconds', 'seconds', 
# 'minutes', 'hours', or 'days' (or select abbreviated versions of these).  For the '360_day'
# calendar, 'months' can also be used, or for the 'noleap' calendar 'common_years' can also be 
# used. Got 'years' instead, which are not recognized.

Relevant log output

No response

Anything else we need to know?

Related issues

  • https://github.com/Unidata/cftime/issues/68
  • https://github.com/pydata/xarray/issues/1467

Potential Solutions

  • https://stackoverflow.com/questions/64624689/xarray-error-when-decoding-netcdf-data-with-time-units-of-years-since
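
One possible workaround, sketched here under assumptions and not tested against xcdat: since the time coordinates are already decoded, rewrite the encoding units to "days since <same reference>" before writing. `with_cf_units` is a hypothetical helper that operates on a plain dict shaped like `ds.time.encoding`. It keeps the reference date string as-is, and the file will then store day-based units and not the original non-CF units.

```python
def with_cf_units(encoding: dict, fallback_unit: str = "days") -> dict:
    """Hypothetical helper: copy `encoding`, swapping months/years for `fallback_unit`."""
    updated = dict(encoding)  # never mutate the caller's dict
    unit_name, sep, reference = str(encoding.get("units", "")).partition(" since ")
    # Only touch units of the form "months since ..." or "years since ...".
    if sep and unit_name.strip().lower() in {"months", "years"}:
        updated["units"] = f"{fallback_unit} since {reference.strip()}"
    return updated


encoding = {"calendar": "standard", "units": "months since 2000-01-01"}
print(with_cf_units(encoding))
# {'calendar': 'standard', 'units': 'days since 2000-01-01'}

# Assumed usage with a decoded dataset (not run here):
#   ds.time.encoding.update(with_cf_units(ds.time.encoding))
#   ds.to_netcdf("out.nc")
```

Time bounds may need the same treatment if they carry their own encoding.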

Environment

Latest main and xcdat=0.4.0

tomvothecoder, Jan 05 '23 19:01