RFC: `linalg.outer` support batches of vectors

Open AnirudhDagar opened this issue 2 years ago • 5 comments

Current Status

The current API spec defines linalg.outer only for one-dimensional vector inputs.

Quoting from the docs.

Computes the outer product of two vectors x1 and x2.

Parameters

  • x1:
    • first one-dimensional input array of size N. Should have a numeric data type.
  • x2:
    • second one-dimensional input array of size M. Should have a numeric data type.

Proposal

Most array/tensor libraries routinely manipulate batches of vectors. It would be worth considering batch support in linalg.outer for the Array API standard, rather than supporting only 1-D vectors. This would also bring linalg.outer more in line with the other linalg functions in the spec, which generally accept n-dimensional arrays/tensors.
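To make the proposal concrete, here is a minimal NumPy sketch of what a batched outer product could look like. The helper name `batched_outer` is hypothetical, and the semantics are an assumption rather than anything the spec defines: the last axis is treated as the vector axis and the leading axes are broadcast as batch dimensions.

```python
import numpy as np

def batched_outer(x1, x2):
    """Hypothetical helper, not part of any spec or library.

    Assumes the last axis holds the vector and that all leading
    axes are batch dimensions which broadcast against each other.
    """
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    # (..., N, 1) * (..., 1, M) broadcasts to (..., N, M)
    return x1[..., :, np.newaxis] * x2[..., np.newaxis, :]

x1 = np.arange(6.0).reshape(2, 3)  # batch of 2 vectors, N = 3
x2 = np.arange(8.0).reshape(2, 4)  # batch of 2 vectors, M = 4
out = batched_outer(x1, x2)
print(out.shape)  # (2, 3, 4)
```

Under these assumptions, 1-D inputs reduce to the ordinary outer product, so existing 1-D usage would keep its current result.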


Interestingly

  • PyTorch currently has torch.outer, which also supports only 1-D tensors.
  • NumPy's numpy.outer does accept n-dimensional arrays, but it flattens any input that is not already one-dimensional.
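The NumPy flattening behaviour can be seen in a short example (the array names are just for illustration): two 2-D inputs do not produce a per-row outer product, but a single matrix built from the flattened inputs.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # 6 elements in total
b = np.arange(4).reshape(2, 2)  # 4 elements in total

# numpy.outer flattens both inputs before computing the product,
# so the batch structure of `a` and `b` is lost.
flattened = np.outer(a, b)
print(flattened.shape)  # (6, 4)
```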

This was initially discussed in pytorch/pytorch#63293.

cc @Lezcano @rgommers

AnirudhDagar avatar Aug 16 '21 20:08 AnirudhDagar

It would be great if an array library (e.g., Torch) would implement such behavior. Without such an implementation, we'd be blazing a new path here, one which is incompatible with existing behavior. For NumPy, moving to batch would be a breaking change. For Torch, this too would be a breaking change (given the 1D requirement), but Torch also seems more generally willing to introduce breaking changes.

kgryte avatar Aug 16 '21 20:08 kgryte

I think that we will be adding this behaviour as we add this function to torch.linalg.

Note that the change in Torch would not be BC-breaking. We tend to be quite happy to add changes as long as they are backwards compatible :)

lezcano avatar Aug 17 '21 09:08 lezcano

For reference, the array API currently supports batching for vector_norm, cross, and vecdot.

As mentioned previously, NumPy does not have a batched outer API. It does have ufunc.outer, which supports multidimensional arrays; however, that functionality is equivalent to the tensor product and can also be accomplished with tensordot.

Accordingly, there is no API currently for performing batch outer products. I will raise this at the next consortium meeting to determine whether there is appetite for adding support in the standard.
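A small sketch of the distinction, using illustrative array names: `np.multiply.outer` and `np.tensordot(..., axes=0)` both yield the full tensor product, whose shape is the concatenation of the input shapes. The `einsum` line shows one possible reading of a batched outer product (pairing vectors along a shared leading axis); that reading is an assumption, not something the standard specifies.

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)  # batch of 2 vectors, N = 3
b = np.arange(8.0).reshape(2, 4)  # batch of 2 vectors, M = 4

# Tensor product: every element of `a` times every element of `b`.
tensor_product = np.multiply.outer(a, b)
via_tensordot = np.tensordot(a, b, axes=0)
print(tensor_product.shape)  # (2, 3, 2, 4)

# Assumed batched semantics: pair a[k] with b[k] only.
batched = np.einsum("bi,bj->bij", a, b)
print(batched.shape)  # (2, 3, 4)
```

The batched result is the slice of the tensor product where the two batch indices coincide, so the tensor product computes (and stores) more than a batched outer product needs.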

kgryte avatar Oct 04 '21 18:10 kgryte

Based on discussions in recent consortium meetings, adding support for batching in linalg.outer has been received positively and is tentatively slated for the 2022 revision of the array API standard.

kgryte avatar Apr 18 '22 07:04 kgryte

This still hasn't been implemented in PyTorch. We should probably bump this, and first deprecate the current NumPy behavior (which is pretty bad, so it deserves deprecation anyway, even if it weren't for this standard).

rgommers avatar Nov 28 '22 16:11 rgommers