Deprecate the multi-index dimension coordinate
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
This PR adds a future_no_mindex_dim_coord=False option that, if set to True, enables the future behavior of PandasMultiIndex (i.e., no added dimension coordinate with tuple values):
import xarray as xr
ds = xr.Dataset(coords={"x": ["a", "b"], "y": [1, 2]})
ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions: (z: 4)
# Coordinates:
# * z (z) object MultiIndex
# * x (z) <U1 'a' 'a' 'b' 'b'
# * y (z) int64 1 2 1 2
# Data variables:
# *empty*
with xr.set_options(future_no_mindex_dim_coord=True):
    ds.stack(z=["x", "y"])
# <xarray.Dataset>
# Dimensions: (z: 4)
# Coordinates:
# * x (z) <U1 'a' 'a' 'b' 'b'
# * y (z) int64 1 2 1 2
# Dimensions without coordinates: z
# Data variables:
# *empty*
There are a few other things that we'll need to adapt or deprecate:
- Dropping the multi-index dimension coordinate de facto allows having several multi-indexes along the same dimension. Normally stack should already take this into account, but there may be other places where this is not yet supported or where we should raise an explicit error.
- Deprecate Dataset.reorder_levels: its API is not compatible with the absence of a dimension coordinate or with several multi-indexes along the same dimension. I think it is OK to deprecate such an edge case, which could alternatively be handled by extracting the pandas index, updating it, and then re-assigning it to the dataset with assign_coords(xr.Coordinates.from_pandas_multiindex(...)).
- The text-based repr: in the example above, "Dimensions without coordinates: z" doesn't make much sense.
- ... ?
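The reorder_levels alternative mentioned above could look roughly like the following. This is a minimal sketch against the current default behavior (where the z dimension coordinate still exists); the variable names are illustrative, and dropping the old coordinates before re-assigning is a cautious assumption rather than a confirmed requirement.

```python
import pandas as pd
import xarray as xr

ds = xr.Dataset(coords={"x": ["a", "b"], "y": [1, 2]}).stack(z=["x", "y"])

# Extract the underlying pandas.MultiIndex and reorder its levels with pandas.
midx = ds.indexes["z"]
new_midx = midx.reorder_levels(["y", "x"])

# Build new coordinates (and their index) from the updated MultiIndex.
new_coords = xr.Coordinates.from_pandas_multiindex(new_midx, "z")

# Drop all coordinates of the old multi-index together, then assign the new ones.
# (Assumption: the old coordinates are named "z", "x" and "y", as in the
# current default behavior.)
reordered = ds.drop_vars(["z", "x", "y"]).assign_coords(new_coords)

print(reordered.indexes["z"].names)
```

Only the level order changes here; the element order along z is left as it was.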
I started updating the tests, although this will be much easier once #8140 is merged. This is something that we could also easily split into multiple PRs. It is probably OK if some features are (temporarily) breaking badly when setting future_no_mindex_dim_coord=True.
I've been trying to use .set_xindex more, and rely less on MultiIndexes. It's overall worked really well! I do think there's a better world just over the horizon...
One thing I haven't managed to do is .unstack without MultiIndexes — lmk if I'm missing something. There's a possible API like .unstack("foo", "bar") which asserts foo & bar are indexes along the same dimensions, and unstacks them, without needing a multiindex...
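For context, here is a minimal sketch of the workaround available today, which still goes through a temporary MultiIndex. It assumes two coordinates, foo and bar, each indexed separately with .set_xindex along a shared dimension z; the data variable v and all values are made up for illustration.

```python
import xarray as xr

ds = xr.Dataset(
    {"v": ("z", [10, 20, 30, 40])},
    coords={
        "foo": ("z", ["a", "a", "b", "b"]),
        "bar": ("z", [1, 2, 1, 2]),
    },
)

# Two independent single-coordinate indexes along the same dimension.
ds = ds.set_xindex("foo").set_xindex("bar")

# Unstacking currently needs a MultiIndex, so build one temporarily:
# drop the two separate indexes, then combine the coordinates along z.
stacked = ds.drop_indexes(["foo", "bar"]).set_index(z=["foo", "bar"])
unstacked = stacked.unstack("z")

print(unstacked["v"].dims)
```

The hypothetical .unstack("foo", "bar") API would skip the drop_indexes / set_index round trip.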