Be more clear about broadcasting behavior in setitem

Open asmeurer opened this issue 2 years ago • 2 comments

NumPy allows the following behavior in `__setitem__`:

>>> import numpy as np
>>> x = np.empty((2, 3, 4))
>>> a = np.empty((1, 3, 4))
>>> x[1, ...].shape
(3, 4)
>>> x[1, ...] = a

This is a sort of "reverse broadcasting". The right-hand side, with shape (1, 3, 4), can't be broadcast to the shape of the left-hand side, (3, 4), but NumPy accepts it anyway because the two shapes differ only by a leading size-1 dimension.
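To make the asymmetry concrete, here is a minimal sketch comparing ordinary broadcasting with the assignment above. The flag names `broadcast_ok` and `setitem_ok` are only illustrative, and the assignment is wrapped in `try` because a future NumPy release may reject it.

```python
import numpy as np

x = np.empty((2, 3, 4))
a = np.ones((1, 3, 4))

# Ordinary broadcasting can prepend dimensions or stretch size-1
# dimensions, but it can never drop one, so (1, 3, 4) -> (3, 4) fails.
try:
    np.broadcast_to(a, x[1, ...].shape)
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# The assignment follows different rules. On NumPy versions that still
# permit the behavior described in this issue, it succeeds.
try:
    x[1, ...] = a
    setitem_ok = True
except ValueError:
    setitem_ok = False

print("broadcast_to accepted:", broadcast_ok)
print("setitem accepted:", setitem_ok)
```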

We should more explicitly disallow this in the spec. Right now the spec just says:

As implied by the broadcasting algorithm, in-place element-wise operations must not change the shape of the in-place array as a result of broadcasting.

I think it should also say something like "the shape of the right-hand side of an operation must be broadcastable into the shape of the left-hand side, after applying any indices". It's also not completely clear that this statement applies to plain __setitem__ as an "in-place operation".
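A rough sketch of what that wording could mean in practice, assuming the rule is "broadcasting the two shapes together must give back exactly the indexed left-hand shape". The helper name `rhs_broadcastable_to` is hypothetical and not part of the spec or of NumPy. It relies only on `np.broadcast_shapes`.

```python
import numpy as np


def rhs_broadcastable_to(lhs_shape, rhs_shape):
    """Return True if rhs_shape broadcasts to exactly lhs_shape.

    Hypothetical helper illustrating the proposed wording; lhs_shape is
    the shape of the indexed left-hand side, e.g. x[1, ...].shape.
    """
    try:
        result = np.broadcast_shapes(lhs_shape, rhs_shape)
    except ValueError:
        # The shapes are not broadcast-compatible at all.
        return False
    # If the result differs from lhs_shape, the assignment would need
    # to grow or reshape the left-hand side, which is not allowed.
    return result == tuple(lhs_shape)


print(rhs_broadcastable_to((3, 4), (4,)))       # regular broadcasting
print(rhs_broadcastable_to((3, 4), (1, 4)))     # size-1 stretch
print(rhs_broadcastable_to((3, 4), (1, 3, 4)))  # the case in this issue
```

Under this reading the first two cases are accepted and the `(1, 3, 4)` case is rejected, since the broadcast result would have three dimensions rather than two.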

Note that the NumPy developers have expressed interest in deprecating this behavior in NumPy https://github.com/numpy/numpy/pull/10615#issuecomment-920987394 (@seberg). It's also worth noting that NumPy's documentation says

You may use slicing to set values in the array, but (unlike lists) you can never grow the array. The size of the value to be set in x[obj] = value must be (broadcastable) to the same shape as x[obj].

which itself doesn't quite capture this "reverse broadcasting" semantics.

Currently the numpy.array_api implementation allows the above to work. I haven't checked any other array libraries.

asmeurer, May 03 '22 22:05

Can probably simplify the example here using arrays which are explicitly not of the same rank:

In [15]: x = np.zeros((3,4))

In [16]: x
Out[16]: 
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [17]: x.shape
Out[17]: (3, 4)

In [18]: y = np.ones((1,3,4))

In [19]: y
Out[19]: 
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [20]: y.shape
Out[20]: (1, 3, 4)

In [21]: x[...] = y

In [22]: x
Out[22]: 
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [23]: x.shape
Out[23]: (3, 4)

In [24]: z = np.ones((1,1,1,1,1,1,1,1,1,1,3,4))*2

In [25]: z.shape
Out[25]: (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4)

In [26]: x[...] = z

In [27]: x
Out[27]: 
array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]])

Agreed that this behavior is likely not desirable.

kgryte, May 05 '22 06:05

To summarize an observation from the meeting, this behavior seems to be using an implicit squeeze. Given that squeeze can be error-prone (even when explicit), it seems reasonable to raise an exception to avoid subtle bugs creeping into user code.
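For comparison, a small sketch of what the explicit version could look like in user code if the implicit form were disallowed. It reuses the shapes from the example above and assumes only `np.squeeze` with an explicit `axis`.

```python
import numpy as np

x = np.zeros((3, 4))
y = np.ones((1, 3, 4))

# Explicit form: name the size-1 axis being dropped. np.squeeze raises
# ValueError if axis 0 does not have size 1, so a shape mistake surfaces
# here instead of being silently absorbed by the assignment.
x[...] = np.squeeze(y, axis=0)

print(x.shape)
```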

jakirkham avatar May 05 '22 18:05 jakirkham