`at.square(x)` and `at.pow(x, 2)` don't etuplize to the same expression
import aesara.tensor as at
from etuples import etuplize
a = at.scalar('a')
etuplize(at.square(a))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, sqr, <frozendict {}>), a)
etuplize(at.pow(a, 2))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, pow, <frozendict {}>), a, TensorConstant{2})
This means that `a ** 2` will fail to unify with `etuplize(at.square(a))`:
print(etuplize(a ** 2))
# e(e(<class 'aesara.tensor.elemwise.Elemwise'>, pow, <frozendict {}>), a, TensorConstant{2})
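To make the failure concrete, here is a minimal sketch of structural unification over plain tuples. It is a toy model, not the `etuples` or `unification` API: the tuple encoding, the `~x` convention for logic variables, and the helper names `is_var`, `walk` and `unify` are all assumptions made for illustration, and there is no occurs check.

```python
# Toy model: an expression is a nested tuple (operator, *args).
# Strings starting with "~" are logic variables (hypothetical convention).

def is_var(term):
    return isinstance(term, str) and term.startswith("~")

def walk(term, subst):
    # Follow variable bindings until we reach an unbound variable or a value.
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(u, v, subst=None):
    """Return a substitution that makes u and v equal, or None if there is none."""
    subst = {} if subst is None else subst
    u, v = walk(u, subst), walk(v, subst)
    if u == v:
        return subst
    if is_var(u):
        return {**subst, u: v}
    if is_var(v):
        return {**subst, v: u}
    # Compound terms only unify when they have the same length and
    # every position (operator included) unifies.
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

square_a = ("sqr", "a")     # stands in for etuplize(at.square(a))
pow_a_2 = ("pow", "a", 2)   # stands in for etuplize(a ** 2)
pattern = ("sqr", "~x")     # a pattern written against the "square" form

print(unify(pattern, square_a))  # {'~x': 'a'}
print(unify(pattern, pow_a_2))   # None
```

The second call fails for two reasons at once: the operators differ and the arities differ, which mirrors the two etuples shown above.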
I guess this question is more about why at.square is not an alias to at.pow(..., 2).
Probably an optimization. I wouldn't be surprised if squaring was faster than power(x, 2).
So that could eventually become a rewrite? And is that universally true for every backend, or is that distinction tied to the C backend?
We already have some rewrites: https://github.com/aesara-devs/aesara/blob/ec82b9f7ac1bdab73110afb8ee2e1a1517b8755d/aesara/tensor/rewriting/math.py#L1892
I guess you mean we might be missing an intermediate canonicalization form that is the same for either graph.
~~Yes.~~
Then we can have at.square(x) be an alias for at.pow(x, 2), which is a question of consistency of the representation in the IR. And then let the canonicalisation handle the computation cost concerns.
Note that at.reciprocal(x) should be an alias for at.pow(x, -1) and at.sqrt(x) an alias for at.pow(x, 0.5) for the same consistency reasons.
Consistency is going to prevent special-casing the code in downstream libraries, for instance in AePPL.
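A rough sketch of what such a canonical form could look like, written over plain tuples rather than Aesara graphs. Everything here is hypothetical and for illustration only: the `(operator, *args)` encoding, the `POW_ALIASES` table and the `canonicalize` function are not part of Aesara, and the operator names are just labels.

```python
# Hypothetical table: each specialised operator and the exponent it stands for.
POW_ALIASES = {"sqr": 2, "reciprocal": -1, "sqrt": 0.5}

def canonicalize(expr):
    """Rewrite specialised operators into ("pow", x, exponent), bottom-up."""
    if not isinstance(expr, tuple) or not expr:
        return expr  # leaf (variable name or constant)
    op, *args = expr
    args = [canonicalize(arg) for arg in args]
    if op in POW_ALIASES and len(args) == 1:
        return ("pow", args[0], POW_ALIASES[op])
    return (op, *args)

# Both spellings of "a squared" end up with the same representation.
assert canonicalize(("sqr", "a")) == canonicalize(("pow", "a", 2))

print(canonicalize(("sqrt", ("reciprocal", "a"))))
# ('pow', ('pow', 'a', -1), 0.5)
```

With a single canonical form, a downstream library only needs patterns written against `pow`; a later specialization pass could still swap in a faster operator where one exists.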