array-api icon indicating copy to clipboard operation
array-api copied to clipboard

Should implementing `__imatmul__` really be enforced?

Open seberg opened this issue 3 years ago • 10 comments

I was not around when this may have been discussed in https://github.com/numpy/numpy/pull/21912. In-place matmul is a weird beast, but for NumPy it would be strange to make @= and out-of-place operator. Because of this, NumPy doesn't define it currently, and I am unsure that it should be done. Would NumPy have the tight restriction that the shape must fit and we actually assign back to a (i.e. truly in-place at all operators?).

That would make sense, but even then, it would even be slower and use as much memory as before anyway.

seberg avatar Nov 16 '22 17:11 seberg

I think it's just a simple consistency issue. If all operators have in-place versions, then @ probably should too unless there is a very good reason not to. PEP 465 also defines it - and that came from the NumPy community.

Would NumPy have the tight restriction that the shape must fit and we actually assign back to a (i.e. truly in-place at all operators?).

I'd say yes - that seems like the only consistent option for the semantics to me.

It won't be used much, and yes it won't be faster than the out-of-place version. But I don't think that's a good enough reason to not add it. The semantics are well-defined.

rgommers avatar Nov 16 '22 22:11 rgommers

I don't think it is required. The spec says in-place operators "may" be supported with __iop__ (https://data-apis.org/array-api/latest/API_specification/array_object.html#in-place-operators). If you don't implement it, then it will fallback to the Python object default which uses __matmul__.

Also, I thought I remembered it only being required for instances where the left-hand array shape wouldn't change (i.e., the right-hand operator is square), but I can't find that now.

asmeurer avatar Nov 17 '22 01:11 asmeurer

Yeah just to back up Aaron's comment. From the Python docs:

If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, if x is an instance of a class with an __iadd__ method, x += y is equivalent to x = x.__iadd__(y) . Otherwise, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y.

jakirkham avatar Nov 17 '22 05:11 jakirkham

Also https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html#copyview-mutability. When views are avoided, a = a @ b and a @= b are equivalent. I guess a difference would be that the latter, if implemented as __imatmul__ could raise an exception when the shape changes (the spec says that that shouldn't be allowed, so it's out of scope). If a is a view or has a view to it, they aren't the same.

Should the spec say that if views are implemented then in-place operators should actually act in-place? To me it's not necessarily clear that it says that.

asmeurer avatar Nov 17 '22 05:11 asmeurer

@asmeurer, that is what tensorflow does, but I think it is misleading. In NumPy:

arr @= arr

is always an error. This ensures that we don't have the one odd "in-place" operator which isn't in-place at all. I do think that is important.

So the question is whether a library can expect arr @= arr to work in the complicated and very strict case where the shapes align well. From the NumPy perspective, I really don't see a point and I really dislike not implementing it, we need to implement it as an error.

seberg avatar Nov 17 '22 07:11 seberg

OK, I am unsure that it is very useful to support it, but it seems OK.

There is one broadcasting corner case that needs to be decided on in NumPy (which must have been part of the reason why I was unsure this is actually useful):

>>> arr = np.array([1., 2.])
>>> arr @ arr
5.0
>>> arr @= arr
>>> arr
array([5., 5.])

is a bit strange. But at least in-place behavior may be useful for stacks of small matrices (in some future).

seberg avatar Nov 17 '22 10:11 seberg

Can you explain what's going on in your example? How do you end up with [5., 5.]? In NumPy this gives an error so is this an example you made up or is this something that used to work?

asmeurer avatar Nov 17 '22 20:11 asmeurer

Note that the PEP only mentions broadcasting for >2 dimensions, and we also say the same in the spec (because contracted dimensions should never broadcast).

asmeurer avatar Nov 17 '22 20:11 asmeurer

It's an example if NumPy were to implement this (doesn't work today).

This demonstrates the behavior that it would have

import numpy as np

a = np.array([1., 2.])

a1 = a.copy()
np.matmul(a1, a1, out=a1)

The issue is the scalar result gets broadcast and assigned to the whole array.

In any event we are discussing potentially raising an error in this case (as it is odd behavior).

jakirkham avatar Nov 17 '22 20:11 jakirkham

Oh I understand now. I never realized that out allows "double broadcasting" like this. matmul is special because it disallows broadcasting in the last two dimensions. I don't know if it's possible in the ufunc machinery to do it, but IMO it should be an error to let the broadcasting to out.shape apply to the last two dimensions for matmul (and any other function like it that has non-broadcasting dimensions).

In general, contracted dimensions shouldn't broadcast, and to me it makes sense to extend this to the "out" broadcasting that happens in the ufunc (and especially so if that ends up being the semantics for @=).

The spec says:

An in-place operation must not change the data type or shape of the in-place array as a result of Type Promotion Rules or Broadcasting.

An in-place operation must have the same behavior (including special cases) as its respective binary (i.e., two operand, non-assignment) operation. For example, after in-place addition x1 += x2, the modified array x1 must always equal the result of the equivalent binary arithmetic operation x1 = x1 + x2.

So it would already disallow this behavior. But it's probably worth adding a note clarifying what is required for __imatmul__ since it is different from the rest of the operators.

asmeurer avatar Nov 17 '22 20:11 asmeurer