cf-xarray
Decoder for MultiIndexes fails if there are other variables, using a dimension which is part of the multiindex
First, thank you so much. Compression by gathering is an incredibly useful addition, which will hopefully end up in xarray one day to support ragged (or sparse) arrays in netCDF files.
#321 added support for encoding and decoding pandas MultiIndexes using "compression by gathering". However, if other variables in the dataset use a dimension that is part of the MultiIndex, decoding fails.
A minimal example, derived from the Encoding and decoding tutorial, needs only one added line, the one defining var_with_lat:
import cf_xarray as cfxr
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {"landsoilt": ("landpoint", np.random.randn(4), {"foo": "bar"})},
    {
        "landpoint": pd.MultiIndex.from_product(
            [["a", "b"], [1, 2]], names=("lat", "lon")
        )
    },
)
# Uncommenting the next line makes the decoding step fail.
# ds["var_with_lat"] = xr.DataArray([1, 2], dims="lat")
encoded = cfxr.encode_multi_index_as_compress(ds, "landpoint")
decoded = cfxr.decode_compress_to_multi_index(encoded, "landpoint")
Once var_with_lat is added, decoding fails:
---> 129 decoded = cfxr.decode_compress_to_multi_index(encoded, "landpoint")

File ~/devenv3/lib/python3.11/site-packages/cf_xarray/coding.py:116, in decode_compress_to_multi_index(encoded, idxnames)
    110     from xarray.indexes import PandasMultiIndex
    112     variables = {
    113         dim: encoded[dim].isel({dim: xr.Variable(data=index, dims=idxname)})
    114         for dim, index in zip(names, indices)
    115     }
--> 116     decoded = decoded.assign_coords(variables).set_xindex(
    117         names, PandasMultiIndex
    118     )
    119 except ImportError:
    120     arrays = [encoded[dim].data[index] for dim, index in zip(names, indices)]

File ~/devenv3/lib/python3.11/site-packages/xarray/core/dataset.py:4330, in Dataset.set_xindex(self, coord_names, index_cls, **options)
   4327 indexed_coords = set(coord_names) & set(self._indexes)
   4329 if indexed_coords:
-> 4330     raise ValueError(
   4331         f"those coordinates already have an index: {indexed_coords}"
   4332     )
   4334 coord_vars = {name: self._variables[name] for name in coord_names}
   4336 index = index_cls.from_variables(coord_vars, options=options)

ValueError: those coordinates already have an index: {'lat'}
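The last frame of the traceback shows that the error is raised by a guard in `Dataset.set_xindex`, which refuses coordinates that already carry an index. As a rough illustration only, that guard can be triggered without cf-xarray at all. The sketch below assumes an xarray version that provides `set_xindex`; the tiny dataset is made up for the purpose and it does not reproduce the decoding path itself, only the check that produces the message.

```python
import xarray as xr

# Hypothetical stand-in dataset: "lat" is a dimension coordinate,
# so xarray gives it a default index automatically.
ds = xr.Dataset(coords={"lat": ("lat", ["a", "b"])})
assert "lat" in ds.indexes

try:
    # Asking for a new index on an already indexed coordinate
    # hits the same guard seen in the traceback.
    ds.set_xindex("lat")
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

This suggests, without proving it, that `lat` still has its own index on the intermediate dataset when the decoder reaches `set_xindex`, which would be consistent with a variable along `lat` carrying that coordinate through.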