
`isel(multi_index_level_name = MultiIndex.level)` corrupts the MultiIndex

Open dcherian opened this issue 1 year ago • 1 comment

What happened?

From https://github.com/pydata/xarray/discussions/8951

If d is a MultiIndex-ed dataset with levels (x, y, z), and m is a dataset with a single coord x, then m.isel(x=d.x) builds a dataset whose MultiIndex has only the levels (y, z): the x level is lost. This seems like it should work.

cc @benbovy

What did you expect to happen?

No response

Minimal Complete Verifiable Example

import pandas as pd, xarray as xr, numpy as np

xr.set_options(use_flox=True)

test = pd.DataFrame()
test["x"] = np.arange(100) % 10
test["y"] = np.arange(100)
test["z"] = np.arange(100)
test["v"] = np.arange(100)

d = xr.Dataset.from_dataframe(test)
d = d.set_index(index = ["x", "y", "z"])
print(d)

m = d.groupby("x").mean()
print(m)

print(d.xindexes)
print(m.isel(x=d.x).xindexes)

xr.align(d, m.isel(x=d.x))
#res = d.groupby("x") - m
#print(res)

<xarray.Dataset>
Dimensions:  (index: 100)
Coordinates:
  * index    (index) object MultiIndex
  * x        (index) int64 0 1 2 3 4 5 6 7 8 9 0 1 2 ... 8 9 0 1 2 3 4 5 6 7 8 9
  * y        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
  * z        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
Data variables:
    v        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
<xarray.Dataset>
Dimensions:  (x: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    v        (x) float64 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0
Indexes:
  ┌ index    PandasMultiIndex
  │ x
  │ y
  └ z
Indexes:
  ┌ index    PandasMultiIndex
  │ y
  └ z
ValueError...

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

dcherian, Apr 17 '24 15:04

I think this occurs with fancy indexing of an xarray object (i.e., passing another DataArray as the indexer argument to isel) when the same coordinate name is found in both the indexed object and the indexer.

Remove the name conflict and it works fine, e.g.,

xr.align(d, m.rename(x="w").isel(w=d.x))

In such a case, the coordinate from the indexer should probably be passed to the result instead of the one found in the indexed object (this is not the current behavior, although I haven't checked how the coordinates are merged in the result).
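
For reference, here is a minimal, self-contained sketch of the rename workaround, rebuilding d and m the same way as the MCVE in the issue description. The name "w" and the variables result, aligned_d and aligned_result are arbitrary choices for illustration, and the exact set of coordinates on the result may differ between xarray versions, so treat this as a sketch rather than a statement of how isel is meant to behave.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Rebuild d and m as in the MCVE from the issue description.
test = pd.DataFrame(
    {
        "x": np.arange(100) % 10,
        "y": np.arange(100),
        "z": np.arange(100),
        "v": np.arange(100),
    }
)
d = xr.Dataset.from_dataframe(test).set_index(index=["x", "y", "z"])
m = d.groupby("x").mean()

# Rename the dimension coordinate of m so that it no longer shares a name
# with the "x" level carried by the indexer d.x ("w" is an arbitrary name).
result = m.rename(x="w").isel(w=d.x)

# Compare the indexes of the original and of the indexed result.
print(d.xindexes)
print(result.xindexes)

# With the name conflict removed, alignment is reported to succeed.
aligned_d, aligned_result = xr.align(d, result)
```

Since d.x is used as a positional indexer here, each element of result["v"] is the group mean of the corresponding row's x value, laid out along the "index" dimension of d.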

benbovy, Apr 18 '24 13:04