Anomaly in overload of `Ad_array.__rmul__`
Hei,
I have tried to wrap my mind around what is happening here, but I have absolutely no clue:
```python
import numpy as np
import porepy as pp
import scipy.sparse as sps

a = pp.ad.Ad_array(np.array([1]), sps.csr_matrix((1, 10)))
b = np.array([0.9])

c = b * a
print(type(c))     # c is of type numpy.ndarray
print(type(c[0]))  # but it is a numpy array containing Ad_arrays!
```
The reverse order, `c = a * b`, yields the expected result.
Edit:
In debug mode, one can see that in the case `c = b * a`, `b` is unwrapped into the number `0.9` and passed to `__rmul__`. In the other case, `c = a * b`, `b` keeps its type and is passed as a numpy array into `__mul__`.
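The same dispatch can be reproduced without porepy. Below, `Dual` is a hypothetical toy stand-in for `Ad_array` (not porepy code), defining only `__mul__` and `__rmul__`:

```python
import numpy as np

class Dual:
    """Toy stand-in for Ad_array: wraps a value, overloads * both ways."""
    def __init__(self, val):
        self.val = val
    def __mul__(self, other):
        return Dual(self.val * other)
    def __rmul__(self, other):
        return Dual(other * self.val)

a = Dual(np.array([1.0]))
b = np.array([0.9])

left = a * b   # Dual.__mul__ receives the whole ndarray -> a single Dual
right = b * a  # numpy broadcasts: Dual.__rmul__ is called once per element

print(type(left))   # Dual
print(type(right))  # numpy.ndarray with dtype=object, holding Dual elements
```

In the second case numpy treats the `Dual` instance as a 0-d object "scalar" and broadcasts it against `b`, calling `__rmul__` elementwise, which is exactly the object-array result seen above.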
This is basically numpy being too eager to broadcast its way out of everything; see https://stackoverflow.com/a/6129099 for an explanation of sorts.
I am working on an update of the Ad framework that will make operations with AdArrays better defined and handle special cases to the degree possible. However, for this particular case there seems to be no good solution beyond making sure to put the AdArray as the first operand.
Oh my, alright.
Is this issue then obsolete or duplicate?
> This is basically numpy being too eager to broadcast its way out of everything; see https://stackoverflow.com/a/6129099 for an explanation of sorts.
>
> I am working on an update of the Ad framework that will make operations with AdArrays better defined and handle special cases to the degree possible. However, for this particular case there seems to be no good solution beyond making sure to put the AdArray as the first operand.
Either that, or we should tag it as won't-fix.
I tagged it as such and will revisit the issue after the AD update (and delete it if solved).
Thanks, but I am afraid it will never be solved. It is just what it is.
Hei,
There is an implementation of AD in PyTorch, and they don't face this problem because they prohibit implicit casting of ndarrays and scalars to the Tensor class, which is analogous to our DenseArray or pp.ad.Scalar.
We rely on implicit casting to DenseArray, and it seems it is not even possible to raise a warning if a user tries to do `ndarray + DenseArray`. This can lead to tricky errors.
However, there is a way to raise an error if the user does so. For this, we need to sacrifice implicit casting of scalars (not ndarrays) to the AD objects. This means operations such as `DenseArray(...) + 1.25` will become invalid; instead, we'll write `DenseArray(...) + pp.ad.Scalar(1.25)`.
Numpy doesn't know how to add an AD object to an ndarray (`ndarray + DenseArray`), but it sees that adding a single scalar to a DenseArray is allowed, so it adds the ndarray's scalars to the DenseArray elementwise. The fix is to prohibit scalar addition in the Operator code, so that numpy throws an error instead.
Example implementation:

```python
class MyFixedArray(pp.ad.DenseArray):
    def __add__(self, other):
        # Refuse plain Python scalars, so numpy's elementwise fallback
        # fails loudly instead of building an object array.
        if isinstance(other, (float, int)):
            return NotImplemented
        return super().__add__(other)

a = np.arange(3)
b = pp.ad.DenseArray(values=np.ones(3))
c = MyFixedArray(values=np.ones(3))
```
Error reproduced:

```python
>>> a + b
array([Operator '+ operator' formed by Operations.add with 2 children.,
       Operator '+ operator' formed by Operations.add with 2 children.,
       Operator '+ operator' formed by Operations.add with 2 children.],
      dtype=object)
```

Fixed:

```python
>>> a + c
TypeError: unsupported operand type(s) for +: 'int' and 'MyFixedArray'
```
Thanks @Yuriyzabegaev, this is most useful to know. We may not pursue this further, at least not in the near future, but I think you are right that this is the approach we will need to take if we ever try to tackle it.
To avoid confusion: it seems that I mixed up PyTorch with something else. In fact, they do allow mixing scalars and ndarrays with the Tensor class, but the operation `ndarray + Tensor` raises an error. I don't know how they achieve this behavior.
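One numpy mechanism that produces exactly this "`ndarray + X` raises" behavior is setting `__array_ufunc__ = None` on the wrapper class; I cannot say whether PyTorch uses this particular hook, but here is a minimal sketch with a hypothetical `Tensorish` class (an assumption, not porepy or PyTorch code):

```python
import numpy as np

class Tensorish:
    """Hypothetical AD-like wrapper that opts out of numpy broadcasting."""
    # With __array_ufunc__ set to None, ndarray.__add__ returns
    # NotImplemented instead of broadcasting elementwise over this object.
    __array_ufunc__ = None

    def __init__(self, values):
        self.values = np.asarray(values)

a = np.ones(3)
t = Tensorish(np.arange(3))

try:
    a + t  # no __radd__ defined, so Python falls through to TypeError
    raised = False
except TypeError:
    raised = True
print(raised)  # True: ndarray + Tensorish errors instead of broadcasting
```

If the wrapper additionally defined `__radd__`, that method would receive the whole ndarray at once, which is another way to keep control of the operation.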
For this discussion, it might be useful to know: for two classes with their own overloads of `__add__` and `__radd__` (the same holds for multiplication), Python first calls the `__add__` overload of the left-hand operand of `a + b`. So most likely, numpy iterates inside `__add__`, applying `+` between each element and the right-hand operand, which results in Python calling `__radd__` of the class of `b` inside `__add__` of `a` when `a` is a numpy array.
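This dispatch order can be demonstrated in plain Python, without numpy:

```python
calls = []

class A:
    def __add__(self, other):
        calls.append("A.__add__")
        return NotImplemented  # signal that A cannot handle this operand

class B:
    def __radd__(self, other):
        calls.append("B.__radd__")
        return "handled by B"

# Left operand's __add__ is tried first; its NotImplemented makes
# Python fall back to the right operand's __radd__.
# (Exception: if type(b) were a proper subclass of type(a) overriding
# __radd__, Python would try b.__radd__ first.)
result = A() + B()
print(calls)   # ['A.__add__', 'B.__radd__']
print(result)  # handled by B
```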