flox icon indicating copy to clipboard operation
flox copied to clipboard

`xarray_reduce` is incompatible with `DataArray.pipe` due to mandatory `func` kwarg.

Open asford opened this issue 10 months ago • 1 comments

Thanks for the awesome work on this library! It's an awesome contribution to the xarray ecosystem. 🙏

xarray_reduce has a mandatory keyword-only argument func, which is the same name used by xarray.DataArray.pipe. This breaks use of xarray_reduce in .pipe, where it'd otherwise be a very natural fit for fluent interfaces of the form:

ds: xa.Dataset = ... needful load of dataset ...

# error, func keyword argument is captured by .pipe 
ds.pipe(xarray_reduce, "group_coordinate", func="mean")

Though I think it'd be natural-ish for .pipe to use a positional-only argument for func, or the python convention for a dunder-prefixed positional argument (i.e. __func), that ship probably sailed a long time ago...

Flox can fix this behavior to fit into .pipe by changing the name of func. I would, personally, recommend using method signatures closer to the xarray standard, where one-or-more dimensions are represented as a single argument dims : Hashable | list[Hashable].

This would look like:

xarray_reduce(data, ["group", "dims"], "sum")
data.pipe(xarray_reduce, ["group", "dims"], "sum")

xarray_reduce(data, "group_dim", "sum")
data.pipe(xarray_reduce, "group_dim", "sum")

but this is (obviously) a breaking change and probably requires a method rename + shim. Maybe, for discoverabilities sake, this could be a new method flox.xarray.groupby_reduce?

xarray-einstats.einops.reduce is a good example of a "pipeable", "unbound" or "xarray extension" compatible interfaces.

asford avatar Mar 24 '24 17:03 asford

nice catch! We are due for some API improvements soon. IN the mean time can you do ds.pipe(partial(xarray_reduce, func="mean"), ...)

dcherian avatar Mar 26 '24 02:03 dcherian