array-api
RFC: `linalg.outer` support batches of vectors
Current Status
The current API spec defines the behaviour of `linalg.outer` for one-dimensional vector inputs only.
Quoting from the docs:
Computes the outer product of two vectors `x1` and `x2`.
Parameters
- x1: first one-dimensional input array of size N. Should have a numeric data type.
- x2: second one-dimensional input array of size M. Should have a numeric data type.
Proposal
Most array/tensor libraries involve the manipulation and usage of batched vectors. It would be worth considering batch support in `linalg.outer` for the Array API standard, instead of supporting only 1-D vectors. This would also bring `linalg.outer` more in line with the behaviour of other linalg functions in the spec, which generally accept an n-d array/tensor.
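To make the proposal concrete, here is a minimal sketch of one possible batched semantics, written with NumPy broadcasting. The name `batched_outer` is hypothetical, and the shape rule (inputs of shape `(..., N)` and `(..., M)` with broadcastable leading dimensions giving `(..., N, M)`) is an assumption for illustration, not something the spec defines.

```python
import numpy as np


def batched_outer(x1, x2):
    """Hypothetical batched outer product: (..., N) and (..., M) -> (..., N, M).

    Assumes both inputs have at least one dimension and that their
    leading (batch) dimensions are broadcast-compatible.
    """
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    # Append an axis to x1 and insert one before the last axis of x2,
    # then let broadcasting pair up the batch dimensions.
    return x1[..., :, None] * x2[..., None, :]


a = np.arange(6.0).reshape(2, 3)  # batch of 2 vectors of size 3
b = np.arange(8.0).reshape(2, 4)  # batch of 2 vectors of size 4
out = batched_outer(a, b)
print(out.shape)  # (2, 3, 4)
```

For 1-D inputs this reduces to the ordinary outer product, so the existing behaviour would be a special case under this assumed rule.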
Interestingly:
- PyTorch currently has `torch.outer`, which also only supports 1-D tensors.
- NumPy does accept n-d arrays in `numpy.outer`, but it flattens any input that is not already one-dimensional to a 1-D vector.
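The NumPy flattening behaviour can be seen in a short example (the array values are arbitrary, chosen only for illustration):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # 2-D input, 6 elements in total
b = np.arange(4)                # 1-D input of size 4

# numpy.outer flattens `a` to shape (6,) before computing the product,
# so the batch structure of `a` is lost in the result.
flat = np.outer(a, b)
print(flat.shape)  # (6, 4)
```

The result has shape `(6, 4)` rather than something like `(2, 3, 4)`, which is why changing `numpy.outer` to batch semantics would be a breaking change.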
This was initially discussed in pytorch/pytorch#63293.
cc @Lezcano @rgommers
It would be great if an array library (e.g., Torch) would implement such behavior. Without such an implementation, we'd be blazing a new path here, one which is incompatible with existing behavior. For NumPy, moving to batch would be a breaking change. For Torch, this too would be a breaking change (given the 1D requirement), but Torch also seems more generally willing to introduce breaking changes.
I think that we will be adding this behaviour as we add this function to `torch.linalg`.
Note that the change in Torch would not be BC-breaking. We tend to be quite happy to add changes as long as they are backwards compatible :)
For reference, the array API currently supports batching for `vector_norm`, `cross`, and `vecdot`.
As mentioned previously, NumPy does not have a batched `outer` API, but it does have `ufunc.outer`, which supports multidimensional arrays; however, this functionality is equivalent to the tensor product and may also be accomplished with `tensordot`.
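A small sketch of the difference, using `np.multiply.outer` as the `ufunc.outer` instance and `np.einsum` as one way to write a batched outer product (the `einsum` formulation is my own choice for comparison, not an existing batched `outer` API):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # batch of 2 vectors of size 3
b = np.arange(8).reshape(2, 4)  # batch of 2 vectors of size 4

# ufunc.outer pairs every element of `a` with every element of `b`,
# which is the tensor product: result shape is a.shape + b.shape.
full = np.multiply.outer(a, b)
print(full.shape)  # (2, 3, 2, 4)

# tensordot with axes=0 computes the same tensor product.
same = np.tensordot(a, b, axes=0)

# A batched outer product pairs only vectors with the same batch index.
batched = np.einsum("bi,bj->bij", a, b)
print(batched.shape)  # (2, 3, 4)
```

The batched result corresponds to the "diagonal" over the two batch axes of the tensor product, for example `full[0, :, 0, :]` matches `batched[0]`, so the tensor product computes and stores more than a batched outer product needs.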
Accordingly, there is no API currently for performing batch outer products. I will raise this at the next consortium meeting to determine whether there is appetite for adding support in the standard.
Based on discussions in recent consortium meetings, adding support for batching in `linalg.outer` has received positive support and is tentatively slated for the 2022 revision of the array API standard.
This still hasn't been implemented in PyTorch. We should probably bump this, and first deprecate the current NumPy behavior (which is pretty bad, so it deserves deprecation anyway, even if it weren't for this standard).