xMIP icon indicating copy to clipboard operation
xMIP copied to clipboard

Concatenate experiments with different calendars

Open znicholls opened this issue 3 years ago • 6 comments

This might be out of scope for this package (feel free to close if it is), but sometimes you get experiments with different calendars (e.g. historical uses cftime.DatetimeNoLeap and future uses cftime.DatetimeJulian). If you concatenate these using concat_experiments then you get a complete dataset, but it then explodes if you try to plot. Minimal example to reproduce is below

>>> url = "https://cmip6-pds.s3.amazonaws.com/pangeo-cmip6.json"
>>> col = intake.open_esm_datastore(url)
>>> subset = col.search(
...     variable_id="co2mass",
...     experiment_id=["historical", "ssp370SST-lowNTCF"],
...     source_id="GFDL-ESM4"
... )
>>> datasets = subset.to_dataset_dict(
...     aggregate=False,
...     zarr_kwargs={"consolidated": True, "decode_times": True, "use_cftime": True},
...     progressbar=True,
... )
>>> out = concat_experiments(datasets)
>>> out["GFDL-ESM4.gn.Amon.r1i1p1f1.co2mass"]["co2mass"].plot()
....
ValueError: Calendar units are not all equal.

I'm not sure what the rules with calendars are, but in some cases (e.g. working with monthly data) a simple fix is to just re-write everything in the same calendar by force e.g. ds.assign_coords(time=[cftime.DatetimeGregorian(*v.timetuple()) for v in ds.time.values]). Not sure if such a simple fix belongs here or not thought.

znicholls avatar May 04 '22 01:05 znicholls

No I think this is indeed very appropriate. In fact I have written code to replace the time with a uniform calendar in the drift_removal module. We might want to expose that on a higher level, or maybe even consider unifying calendars (at least for monthly data) at the combined_preprocessing level? I think this was proposed by @aaronspring at some point?

I believe there is now also a sleeker way to do that with either xarray or nctime. So maybe a refactor would be appropriate? I think at the minimum, we could build some checking logic into concat_experiments?

jbusecke avatar May 04 '22 15:05 jbusecke

In the meantime xarray introduced convert_calendar which might help

aaronspring avatar May 04 '22 15:05 aaronspring

In the meantime xarray introduced convert_calendar which might help

Unfortunately also not happy

>>> out["GFDL-ESM4.gn.Amon.r1i1p1f1.co2mass"].convert_calendar("standard")
...
File src/cftime/_cftime.pyx:1334, in cftime._cftime.datetime.__richcmp__()

TypeError: cannot compare cftime.DatetimeNoLeap(1850, 1, 16, 12, 0, 0, 0, has_year_zero=True) and cftime.DatetimeJulian(2015, 1, 16, 12, 0, 0, 0, has_year_zero=False)

znicholls avatar May 05 '22 01:05 znicholls

In fact I have written code to replace the time with a uniform calendar in the drift_removal module

Nice, does this also move the dates? I wasn't completely sure looking at it

znicholls avatar May 05 '22 01:05 znicholls

what do you mean by 'move'? The logic I wrote is basically limited to monthly data, so it is not at all general enough.

It seems like working on the upstream xarray stuff would be the best way forward?

jbusecke avatar May 05 '22 18:05 jbusecke

what do you mean by 'move'?

As in, it renames e.g. 1st Jan 1020 to 1st Jan 1850 and shifts all other time points accordingly?

It seems like working on the upstream xarray stuff would be the best way forward?

👍

znicholls avatar May 06 '22 07:05 znicholls