
Cannot use map when grouping by multiple variables

Open • joshua-gould opened this issue 2 months ago • 1 comment

What happened?

Calling map on a groupby over multiple variables raises an IndexError. Reductions such as mean on the same grouping work, and map works when grouping by a single variable.

What did you expect to happen?

No response

Minimal Complete Verifiable Example

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!

import xarray as xr

xr.show_versions()
d = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={
        "lon": (["ny", "nx"], [[30, 40], [40, 50]]),
        "lat": (["ny", "nx"], [[10, 10], [20, 20]]),
    },
    dims=["ny", "nx"],
)

d.groupby(('lon', 'lat')).mean()  # works
d.groupby('lon').map(lambda x: x)  # works
d.groupby(('lon', 'lat')).map(lambda x: x)  # fails
#   File "xarray/core/nputils.py", line 93, in inverse_permutation
#     inverse_permutation[indices] = np.arange(len(indices), dtype=np.intp)
#     ~~~~~~~~~~~~~~~~~~~^^^^^^^^^
# IndexError: arrays used as indices must be of integer (or boolean) type

Steps to reproduce

No response

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output


Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.11 | packaged by conda-forge | (main, Jun 4 2025, 14:38:53) [Clang 18.1.8 ]
python-bits: 64
OS: Darwin
OS-release: 25.1.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('C', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: None
xarray: 2025.12.0
pandas: 2.3.3
numpy: 1.26.4
scipy: 1.16.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.14.0
zarr: 2.18.7
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2025.11.0
distributed: 2025.11.0
matplotlib: 3.10.3
cartopy: None
seaborn: 0.13.2
numbagg: None
fsspec: 2025.7.0
cupy: None
pint: 0.24.4
sparse: 0.17.0
flox: 0.10.7
numpy_groupies: 0.11.3
setuptools: 80.9.0
pip: 25.3
conda: None
pytest: 8.4.2
mypy: None
IPython: 9.6.0
sphinx: 8.1.3

joshua-gould, Dec 11 '25 15:12

Nice example!

This is a very silly bug here: https://github.com/pydata/xarray/blob/172ba1ed6b4471aa941af886e0a3eded5c31f372/xarray/core/groupby.py#L180

np.concatenate([[0], [1], [], [2]])  # -> float64
np.concatenate([[0], [1], [2]])  # -> int64
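
The dtype difference is what leads to the IndexError in the traceback: NumPy converts an empty list to a float64 array, so the concatenated result is promoted to float64, and float arrays cannot be used as indices. A minimal NumPy-only sketch of that chain follows. `positions` here is a hypothetical stand-in for the per-group index lists, assuming a group with no members contributes an empty list.

```python
import numpy as np

# Hypothetical stand-in for per-group index lists; the empty list
# represents a group with no members (an assumption for illustration).
positions = [[0], [1], [], [2]]

indices = np.concatenate(positions)
print(indices.dtype)  # float64, because np.asarray([]) defaults to float64

# Same assignment pattern as the line shown in the traceback.
inverse_permutation = np.full(len(indices), -1, dtype=np.intp)
try:
    inverse_permutation[indices] = np.arange(len(indices), dtype=np.intp)
    raised = False
except IndexError as err:
    raised = True
    print(err)
```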

Are you able to send a PR with this test and a change that looks like this:

np.concatenate(tuple(p for p in positions if p))
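
One caveat with filtering on `if p`: it works when each entry is a Python list, but if an entry is a NumPy array, its truth value is ambiguous for more than one element and is False for a single zero. A rough sketch of a variant that filters on length instead is shown below. `concat_nonempty` is a hypothetical helper name, not part of xarray, and the entry types are an assumption.

```python
import numpy as np


def concat_nonempty(positions):
    """Hypothetical helper: concatenate index sequences, skipping empty ones."""
    nonempty = [np.asarray(p, dtype=np.intp) for p in positions if len(p) > 0]
    if not nonempty:
        # np.concatenate raises ValueError when given no arrays at all.
        return np.empty(0, dtype=np.intp)
    return np.concatenate(nonempty)


indices = concat_nonempty([[0], [1], [], [2]])
print(indices, indices.dtype)

# With an integer dtype, the inverse-permutation assignment succeeds.
inverse = np.full(len(indices), -1, dtype=np.intp)
inverse[indices] = np.arange(len(indices), dtype=np.intp)
print(inverse)
```

The explicit `dtype=np.intp` cast is a belt-and-braces choice, so the result stays integer even if every entry is empty.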

dcherian, Dec 11 '25 16:12