CF accessor adds grid mapping as coordinate
Possibly a duplicate of #357 or a discussion topic. Opinions welcomed.
Description
Since cf_xarray == 0.8.0 (#391), selecting variables by standard name with ds.cf[standard_name] results in a different set of coordinates than selecting them by their short name with ds[short_name]. This prevents re-injecting these variables in a compatible (and even the original) dataset using Dataset.assign() (and possibly other xarray merge operations). Instead trying to Dataset.assign() a cf-picked, georeferenced variable into a raises a MergeError.
Could you help me to understand if this is a design choice or a bug? In my case it comes as a breaking change and I am not exactly sure how to go about (as a cf-xarray fix or downstream in https://github.com/juseg/hyoga/issues/73). Happy to contribute as I can.
Miminal example
import xarray as xr
import cf_xarray
# make grid-mapped fake data (or use xr.open_dataset)
ds = xr.Dataset({
'tas': ((), None, {
'grid_mapping': 'crs',
'standard_name': 'air_temperature'}),
'crs': ((), None, {
'grid_mapping_name': 'latitude_longitude'})})
# ds and ds.cf variables have different coordinates
print('crs' in ds['tas'].coords) # False
print('crs' in ds.cf['air_temperature'].coords) # True
# assign a cf-picked variable (only works on cf_xarray < 0.8.0)
ds.assign(new=ds.cf['air_temperature']) # xarray.core.merge.MergeError
Error message
Traceback (most recent call last):
File "/home/julien/git/code/hyoga/bugs/cfxr08.py", line 17, in <module>
ds.assign(new=ds.cf['air_temperature']) # xarray.core.merge.MergeError
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/xarray/core/dataset.py", line 6080, in assign
data.update(results)
File "/usr/lib/python3.12/site-packages/xarray/core/dataset.py", line 4946, in update
merge_result = dataset_update_method(self, other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/xarray/core/merge.py", line 1104, in dataset_update_method
return merge_core(
^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/xarray/core/merge.py", line 772, in merge_core
raise MergeError(
xarray.core.merge.MergeError: unable to determine if these variables should be coordinates or not in the merged result: {'crs'}
This is a design choice :) .
Many users complain that xarray's default behaviour isn't what cf-xarray currently does.
Some solutions:
- Extract the variable name manually:
ds[*ds.cf.standard_names['air_temperature']] - Use the underlying variable object:
ds.assign(a=ds.cf['air_temperature'].variable)
xarray.core.merge.MergeError: unable to determine if these variables should be coordinates or not in the merged result: {'crs'}
I also think this should just set crs as a coord. Not sure if there is an xarray issue for this.
This is a design choice :) .
Thank you for the clarification and code samples! :-) I will settle on 2. I think 1 needs double brackets:
ds[[*ds.cf.standard_names['air_temperature']]]
I also think this should just set
crsas a coord. Not sure if there is an xarray issue for this.
I could not find an xarray issue on that (https://github.com/pydata/xarray/issues?q=is%3Aissue+mapping+MergeError). However I am still unconvinced whether grid mappings should be considered coordinates (see https://docs.xarray.dev/en/stable/user-guide/terminology.html#term-Coordinate).
Perhaps that is far-fetched, but would you consider a cf-xarray re-implementation of assign? I think it could assign by standard name, and overcome any xarray incompatibility.
ds.cf.assign(air_temperature=darray)
I do something like that at https://github.com/juseg/hyoga/blob/v0.3.0/hyoga/core/accessor.py#L134-L186.
could not find an xarray issue on that I am still unconvinced whether grid mappings should be considered coordinates
https://github.com/pydata/xarray/issues/6447 . In the xarray model "coordinates" is the only way to link variables to each other. That's why I think they should be... Either way, I think this merge should succeed, whether the variable is promoted to coordinates is a somewhat arbitrary implementation detail.
Perhaps that is far-fetched, but would you consider a cf-xarray re-implementation of assign?
Not far fetched at all! I'd be happy to merge a PR adding this.
OK I understand your point with the coordinates now. And great, I will be working on an assign PR then!
Thanks!