array-api
Be more clear about broadcasting behavior in setitem
NumPy allows the following behavior in __setitem__:
>>> import numpy as np
>>> x = np.empty((2, 3, 4))
>>> a = np.empty((1, 3, 4))
>>> x[1, ...].shape
(3, 4)
>>> x[1, ...] = a
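This is sort of a "reverse broadcasting". The rhs can't be broadcast to the shape of the lhs, but the lhs can be broadcast to the shape of the rhs by prepending a size-1 dimension (equivalently, the rhs matches the lhs once its leading size-1 dimension is dropped).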
We should more explicitly disallow this in the spec. Right now the spec just says:
As implied by the broadcasting algorithm, in-place element-wise operations must not change the shape of the in-place array as a result of broadcasting.
I think it should also say something like "the shape of the right-hand side of an operation must be broadcastable into the shape of the left-hand side, after applying any indices". It's also not completely clear that this statement applies to plain __setitem__ as an "in-place operation".
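To make the proposed wording concrete, here is a minimal sketch of the check it implies. The helper name check_setitem_shape is hypothetical and not part of NumPy or the spec; it only assumes that np.broadcast_shapes (NumPy 1.20+) is available. The value is accepted only when broadcasting it against the indexed shape leaves that shape unchanged.

```python
import numpy as np


def check_setitem_shape(x, key, value):
    """Hypothetical helper: raise unless value broadcasts to x[key].shape."""
    target_shape = x[key].shape
    value_shape = np.shape(value)
    try:
        # Broadcasting the two shapes together must give back the target
        # shape; otherwise the value would have to "grow" the target.
        result_shape = np.broadcast_shapes(target_shape, value_shape)
    except ValueError:
        result_shape = None  # shapes are not broadcast-compatible at all
    if result_shape != target_shape:
        raise ValueError(
            f"value of shape {value_shape} is not broadcastable "
            f"to indexed shape {target_shape}"
        )


x = np.empty((2, 3, 4))

# Ordinary broadcasting: (4,) -> (3, 4) is accepted.
check_setitem_shape(x, (1, ...), np.empty((4,)))

# "Reverse broadcasting": (1, 3, 4) -> (3, 4) is rejected.
try:
    check_setitem_shape(x, (1, ...), np.empty((1, 3, 4)))
    rejected = False
except ValueError:
    rejected = True
```

Under this rule the example at the top of the issue would raise, while the usual cases (scalars, lower-rank values, size-1 dimensions that line up with the target) would continue to work.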
Note that the NumPy developers have expressed interest in deprecating this behavior in NumPy https://github.com/numpy/numpy/pull/10615#issuecomment-920987394 (@seberg). It's also worth noting that NumPy's documentation says
"You may use slicing to set values in the array, but (unlike lists) you can never grow the array. The size of the value to be set in x[obj] = value must be (broadcastable) to the same shape as x[obj]."
which itself doesn't quite capture these "reverse broadcasting" semantics.
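One way to see the gap between that documentation and the assignment behavior is to ask NumPy's own broadcasting machinery directly. This is only an illustrative sketch: np.broadcast_to is a real NumPy function, and the variable name broadcast_ok is made up for the example.

```python
import numpy as np

x = np.empty((2, 3, 4))
a = np.zeros((1, 3, 4))

# Under the ordinary broadcasting rules, an array of shape (1, 3, 4)
# cannot be broadcast to the shape of x[1, ...], which is (3, 4).
try:
    np.broadcast_to(a, x[1, ...].shape)
    broadcast_ok = True
except ValueError:
    broadcast_ok = False
```

So the value is not "broadcastable to the same shape as x[obj]" in the documented sense, even though the assignment shown at the top of this issue is accepted.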
Currently the numpy.array_api implementation allows the above to work. I haven't checked any other array libraries.
Can probably simplify the example here using arrays which are explicitly not of the same rank:
In [15]: x = np.zeros((3,4))
In [16]: x
Out[16]:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
In [17]: x.shape
Out[17]: (3, 4)
In [18]: y = np.ones((1,3,4))
In [19]: y
Out[19]:
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
In [20]: y.shape
Out[20]: (1, 3, 4)
In [21]: x[...] = y
In [22]: x
Out[22]:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
In [23]: x.shape
Out[23]: (3, 4)
In [24]: z = np.ones((1,1,1,1,1,1,1,1,1,1,3,4))*2
In [25]: z.shape
Out[25]: (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4)
In [26]: x[...] = z
In [27]: x
Out[27]:
array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]])
Agreed that this behavior is likely not desirable.
To summarize an observation in the meeting, this behavior seems to be using an implicit squeeze. Given squeeze can be error-prone (even when explicit), it seems reasonable to raise to avoid subtle bugs creeping into user code.
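If the implicit form were disallowed, user code that relies on it could spell out the squeeze instead. A minimal sketch, assuming the extra dimensions are all leading and of size 1 (the variable name extra is only for illustration):

```python
import numpy as np

x = np.zeros((3, 4))
z = np.ones((1, 1, 3, 4)) * 2

# Drop the leading size-1 dimensions explicitly before assigning.
# np.squeeze raises ValueError if any of the named axes is not of size 1,
# so an unexpected shape fails loudly instead of being silently accepted.
extra = z.ndim - x.ndim
x[...] = np.squeeze(z, axis=tuple(range(extra)))
```

Naming the axes keeps the squeeze from removing size-1 dimensions that were meant to stay, which is the usual way an unqualified squeeze goes wrong.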