`Dataset.broadcast_like(other)` should broadcast against like variables in other

Open headtr1ck opened this issue 3 years ago • 7 comments

Is your feature request related to a problem?

I am a bit puzzled about how xarray broadcasts Datasets. It seems to always add all dimensions to all variables. Is this the intended behavior in general?

See this example:

import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
# <xarray.DataArray (x: 1, y: 3)>
# array([[1, 2, 3]])
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})
# <xarray.Dataset>
# Dimensions:  (x: 1, z: 2)
# Dimensions without coordinates: x, z
# Data variables:
#     a        (x) int32 1
#     b        (z) int32 2 3
ds.broadcast_like(da)

# returns:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y, z) int32 1 1 1 1 1 1
#     b        (x, y, z) int32 2 3 2 3 2 3

# I think it should return:
# <xarray.Dataset>
# Dimensions:  (x: 1, y: 3, z: 2)
# Dimensions without coordinates: x, y, z
# Data variables:
#     a        (x, y) int32 1 1 1  # notice here without "z" dim
#     b        (x, y, z) int32 2 3 2 3 2 3

Describe the solution you'd like

I would like broadcasting to behave the same way as, e.g., a simple addition. In the example above, `da + ds` produces the dimensions that I want.
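A minimal sketch of that comparison, reusing `da` and `ds` from the example above. The names `result`, `dims_a` and `dims_b` are only illustrative, and dimension names are compared as sets because the dimension order is not the point here:

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Arithmetic broadcasts each data variable against `da` on its own,
# so "a" never picks up the "z" dimension that only "b" has.
result = da + ds

dims_a = set(result["a"].dims)
dims_b = set(result["b"].dims)
print(dims_a, dims_b)
```

Note that the values in `result` are sums, so this only illustrates the resulting dimensions, not a replacement for broadcasting.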

Describe alternatives you've considered

`ds + xr.zeros_like(da)` works, but it seems more like a "dirty hack".
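For illustration, here is a short sketch that puts the workaround next to the current `broadcast_like` result. The variable names (`workaround`, `current`, and so on) are made up for this example:

```python
import xarray as xr

da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Workaround: adding zeros shaped like `da` keeps the values of `ds`
# but expands every variable with the dimensions of `da` only.
workaround = ds + xr.zeros_like(da)

# Current behavior: every variable receives every dimension.
current = ds.broadcast_like(da)

workaround_a_dims = set(workaround["a"].dims)
current_a_dims = set(current["a"].dims)
print(workaround_a_dims, current_a_dims)
```

One caveat with the workaround: if the dtypes of `ds` and `da` differ, the addition can promote the dtype of the result, which a real broadcast would not do.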

Additional context

Maybe one can add an option to broadcasting that controls this behavior?

headtr1ck avatar Apr 30 '22 17:04 headtr1ck

see also #6304 which covers xr.broadcast

keewis avatar Apr 30 '22 18:04 keewis

> see also #6304 which covers xr.broadcast

I tried adding a `join` argument to `Dataset.broadcast_like` and passing it through to `align`, but that did not work (at least for `join="inner"`). I still got the same result...

headtr1ck avatar Apr 30 '22 18:04 headtr1ck

related to https://github.com/pydata/xarray/issues/6227

headtr1ck avatar May 01 '22 14:05 headtr1ck

I keep misunderstanding this issue, so I'm typing this out to make sure I got it right.

Writing out dimension names in square brackets:

ds['a': ['x'], 'b': ['z']].broadcast_like(da: ['x', 'y']) -> ds['a': ['x', 'y'], 'b': ['x', 'y', 'z']]

IIUC the request is to avoid broadcasting the variables in ds against each other, and to only broadcast each variable against da separately. Did I get it right?
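If that reading is right, the requested result can be sketched with existing API by broadcasting each data variable separately. `broadcast_each_like` is a hypothetical helper name, not something xarray provides; it only combines `Dataset.map` with `DataArray.broadcast_like`:

```python
import xarray as xr


def broadcast_each_like(ds, other):
    """Hypothetical helper: broadcast every data variable against `other` on its own."""
    # Dataset.map calls the function once per data variable (as a DataArray),
    # so the variables are never broadcast against each other.
    return ds.map(lambda var: var.broadcast_like(other))


da = xr.DataArray([[1, 2, 3]], dims=("x", "y"))
ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

out = broadcast_each_like(ds, da)
out_dims = {name: set(var.dims) for name, var in out.data_vars.items()}
print(out_dims)
```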

dcherian avatar Feb 06 '25 15:02 dcherian

@mjwillson posted a nice summary in #10031:

> If I had to summarize the overall problem, it's that the behaviour of xarray.broadcast (and Dataset.broadcast_like etc.) is not consistent with how actual arithmetic operations broadcast, in cases where Datasets are involved.

dcherian avatar Feb 06 '25 15:02 dcherian

In the light of #10031, and broadcasting a dataset to itself not behaving as a no-op, should we label this as "bug" rather than "enhancement"?
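A small sketch of the self-broadcast case, assuming an xarray version that still shows the behavior reported in this issue. The names `before` and `after` are illustrative:

```python
import xarray as xr

ds = xr.Dataset({"a": ("x", [1]), "b": ("z", [2, 3])})

# Broadcasting a dataset against itself might be expected to change nothing,
# but every variable ends up with the union of all dimensions.
same = ds.broadcast_like(ds)

before = {name: set(var.dims) for name, var in ds.data_vars.items()}
after = {name: set(var.dims) for name, var in same.data_vars.items()}
print(before, after)
```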

alvarosg avatar Feb 06 '25 15:02 alvarosg

I found a use case for broadcasting a dataset as requested: when trying to change values in-place, this only works if the replacement values are already broadcast. An example can be found here: https://github.com/pydata/xarray/blob/main/xarray/computation/computation.py#L842 where, instead of broadcasting, I had to resort to adding a zero-valued dataset.

headtr1ck avatar Oct 27 '25 20:10 headtr1ck