pint-xarray

implement the `PintMetaIndex`

Open keewis opened this issue 2 years ago • 23 comments

As mentioned in #162, it is possible to get the indexing functions to work, although there still is no public API.

I also still don't quite understand how other methods work since the refactor, so this only implements sel.

Usage, for anyone who wants to play around with it:
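import pint
import xarray as xr
from pint_xarray.index import PintMetaIndex

# `ureg` was not defined in the original snippet; a plain pint registry is assumed here
ureg = pint.UnitRegistry(force_ndarray_like=True)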

ds = xr.tutorial.open_dataset("air_temperature")
arr = ds.air

new_arr = xr.DataArray(
    arr.variable,
    coords={
        "lat": arr.lat.variable,
        "lon": arr.lon.variable,
        "time": arr.time.variable,
    },
    indexes={
        "lat": PintMetaIndex(arr.xindexes["lat"], {"lat": arr.lat.attrs.get("units")}),
        "lon": PintMetaIndex(arr.xindexes["lon"], {"lon": arr.lon.attrs.get("units")}),
        "time": arr.xindexes["time"],
    },
    fastpath=True,
)
new_arr.sel(
    lat=ureg.Quantity([75, 70, 65], "deg"),
    lon=ureg.Quantity([200, 202.5], "deg"),
)

This will fail at the moment because xarray treats dask arrays differently from duck-dask arrays, but passing single values works!

  • [x] Closes #162, closes #205, closes #218
  • [ ] Tests added
  • [x] Passes pre-commit run --all-files
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [ ] New functions/methods are listed in api.rst

keewis, Mar 26 '22 00:03

I also still don't quite understand how other methods work since the refactor, so this only implements sel.

Here are a few comments. Happy to answer questions if any.

There are some Index methods like isel, stack, and rename that I guess do not depend on units, where you could just forward the call + args to the wrapped index.
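A minimal sketch of that forwarding idea, using hypothetical stand-in classes (`ToyIndex`, `UnitAwareIndex`) rather than the real xarray or pint-xarray API. One assumption worth flagging: if the wrapper stores its units keyed by coordinate name, rename still has to update those keys even though the wrapped index does the actual work.

```python
class ToyIndex:
    """Hypothetical stand-in for the wrapped index; not xarray's real class."""

    def __init__(self, values):
        self.values = list(values)

    def isel(self, indexers):
        # Positional selection along a single dimension called "x".
        return ToyIndex(self.values[indexers["x"]])

    def rename(self, name_dict, dims_dict):
        # The toy index stores no names, so there is nothing to rename.
        return self


class UnitAwareIndex:
    """Hypothetical wrapper: keeps a {coordinate name: units} map, delegates the rest."""

    def __init__(self, index, units):
        self.index = index
        self.units = dict(units)

    def isel(self, indexers):
        # Positions do not depend on units: forward the call and re-wrap.
        return UnitAwareIndex(self.index.isel(indexers), self.units)

    def rename(self, name_dict, dims_dict):
        # Forward the call, but keep the units map keyed by the new names.
        units = {name_dict.get(name, name): u for name, u in self.units.items()}
        return UnitAwareIndex(self.index.rename(name_dict, dims_dict), units)


wrapped = UnitAwareIndex(ToyIndex([10, 20, 30]), {"x": "m"})
subset = wrapped.isel({"x": slice(0, 2)})
renamed = wrapped.rename({"x": "distance"}, {})
```

Here `subset` still carries the original units, and `renamed` carries them under the new coordinate name.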

For some other methods like concat, join, equals, and reindex, the easiest would probably be to require that the other indexes given as arguments have the same units (and raise an error otherwise). An alternative option would be to implicitly convert the units of those indexes, but it's difficult to do that in an index-agnostic way unless we define and adopt in Xarray some protocol for that purpose.
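The "same units or raise" option could look roughly like the sketch below. Everything in it is hypothetical (the `UnitAwareIndex` dataclass, the `require_same_units` helper, units stored as plain strings); a real implementation would presumably compare pint unit objects, so that "m" and "meter" count as equal.

```python
from dataclasses import dataclass


@dataclass
class UnitAwareIndex:
    """Hypothetical wrapper holding plain values and a units string."""

    values: list
    units: str

    @classmethod
    def concat(cls, indexes):
        # Refuse to combine anything unless all units match exactly.
        units = require_same_units(indexes)
        merged = [v for index in indexes for v in index.values]
        return cls(merged, units)


def require_same_units(indexes):
    """Return the units shared by all indexes, raising if any differ."""
    first, *rest = indexes
    for other in rest:
        if other.units != first.units:
            raise ValueError(
                f"cannot combine indexes with units {first.units!r} and {other.units!r}"
            )
    return first.units


a = UnitAwareIndex([0, 1], "m")
b = UnitAwareIndex([2, 3], "m")
c = UnitAwareIndex([4, 5], "km")

combined = UnitAwareIndex.concat([a, b])  # same units: allowed

try:
    UnitAwareIndex.concat([a, c])  # different units: rejected
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
```

Raising early keeps the wrapper index-agnostic: it never needs to know how to rescale the values held by the wrapped index.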

The general approach used in the Xarray indexes refactor heavily relies on the type of the indexes (at least when we need to compare them together). That's not super flexible with the PintMetaIndex solution here, but I think it's reasonable enough. For example, alignment will only work when the indexes found for a common set of coordinates / dimensions are all PintMetaIndex objects. This might behave weirdly when the PintMetaIndex objects do not wrap indexes of the same type, although this would be easy to check.

I wonder whether PintMetaIndex should accept one unit or a map of units. The latter makes sense if, e.g., you want to reuse it to wrap multi-indexes where the corresponding coordinate variables (levels) may have different units.
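One way to support both forms is to normalize whatever is passed into a map internally. The helper below is purely illustrative (`normalize_units` is a hypothetical name, and units are plain strings here), not something pint-xarray is known to provide.

```python
from collections.abc import Mapping


def normalize_units(units, coord_names):
    """Return a {coordinate name: units} dict from a single unit or a mapping."""
    if isinstance(units, Mapping):
        missing = [name for name in coord_names if name not in units]
        if missing:
            raise KeyError(f"no units given for: {missing}")
        return {name: units[name] for name in coord_names}
    # A single unit applies to every coordinate of the index.
    return {name: units for name in coord_names}


single = normalize_units("degrees", ["lat"])
levels = normalize_units({"depth": "m", "time": "s"}, ["depth", "time"])
```

With this, a single-coordinate index and a multi-index with per-level units end up with the same internal representation.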

Regarding Index methods like from_variables and create_variables, I guess PintMetaIndex would implement a lightweight wrapping layer to respectively get and assign a pint quantity from / to the Xarray variables.

You should also be careful when converting the units of indexed coordinates, as they may get out of sync with their index. As there's no concept of a "duck" index, the easiest approach would probably be to drop the index (and maybe reconstruct it from scratch) when the coordinates are updated.
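To illustrate the out-of-sync problem and the "drop and rebuild" approach, here is a toy sketch in which a dict lookup plays the role of the index. The names `Coord`, `build_lookup`, and `convert` are hypothetical and unrelated to the real xarray API.

```python
from dataclasses import dataclass, field


def build_lookup(values):
    """Toy 'index': maps each coordinate value to its position."""
    return {value: position for position, value in enumerate(values)}


@dataclass
class Coord:
    """Hypothetical indexed coordinate: values, units, and a lookup built from the values."""

    values: list
    units: str
    lookup: dict = field(init=False)

    def __post_init__(self):
        # The lookup is always derived from the current values.
        self.lookup = build_lookup(self.values)


def convert(coord, factor, new_units):
    # The old lookup is keyed by old-unit values, so it is dropped rather than
    # reused; constructing a new Coord rebuilds it from the converted values.
    return Coord([v * factor for v in coord.values], new_units)


km = Coord([1.0, 2.0, 3.0], "km")
m = convert(km, 1000.0, "m")
```

Had `convert` kept the old lookup, selecting 2000.0 on the converted coordinate would fail even though that value is present, which is the kind of mismatch described above.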

benbovy, Mar 28 '22 13:03