xarray
apply_ufunc and Datasets with variables without the core dimension
Is your feature request related to a problem?
Consider this example
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": ("y", [1, 2, 3])})
xr.apply_ufunc(np.mean, ds, input_core_dims=[["x"]])
This raises
ValueError: operand to apply_ufunc has required core dimensions ['x'], but some of these dimensions are absent on an input variable: ['x']
because core dimension x is missing on variable b. This behaviour makes it annoying to use apply_ufunc on Datasets.
Describe the solution you'd like
Add a new kwarg to apply_ufunc called missing_core_dim that controls how to handle variables without all input core dimensions. This kwarg could take one of two values:
"raise"- raise an error, current behaviour"copy"- skip applying the function and copy the variable from input to output.
Describe alternatives you've considered
No response
Additional context
No response
I would like to have an "ignore" option as well, so that I can pass input_core_dims=[["x", "y"]] in your example.
@shoyer what do you think of adding such a kwarg?
The logic in Dataset.reduce is a bit nuanced: it keeps coordinate variables that lack the reduced dimensions, but still runs the function on data variables without the core dimensions.
https://github.com/pydata/xarray/blob/e2b6f3468ef829b8a83637965d34a164bf3bca78/xarray/core/dataset.py#L6740-L6751
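For comparison, a small sketch of how I understand the linked Dataset.reduce logic to behave on the example from the issue. The coordinate values here are made up for illustration, and the exact handling may differ between xarray versions.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"a": ("x", [1, 2, 3]), "b": ("y", [1, 2, 3])},
    coords={"x": [10, 20, 30], "y": [0.1, 0.2, 0.3]},
)

# Reduce over "x" only. "b" has no "x" dimension, so it keeps its "y"
# dimension, and the "y" coordinate is carried over. The "x" coordinate
# is dropped because its only dimension is reduced away.
reduced = ds.reduce(np.mean, dim="x")
print(reduced)
```

Unlike apply_ufunc, this does not raise when a variable is missing the dimension being reduced.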
Belatedly -- I like this idea!
Check out #8138 for a draft impl!