Should Alloc be pushed downstream of expand_dims
We have some other rewrites that push Alloc below Elemwise, so that we don't compute on repeated inputs, but this doesn't happen if there's an expand_dims in the way. As of now, the following graph does not get lifted:
import pytensor
import pytensor.tensor as pt
x = pt.vector("x", shape=(3,))
y = pt.alloc(x, 1000, 3)[None]
out = pt.exp(y)
pytensor.function([x], out).dprint(print_type=True)
# Exp [id A] <Tensor3(float64, shape=(1, 1000, 3))> 1
# └─ Alloc [id B] <Tensor3(float64, shape=(1, 1000, 3))> 0
# ├─ x [id C] <Vector(float64, shape=(3,))>
# ├─ 1 [id D] <Scalar(int8, shape=())>
# ├─ 1000 [id E] <Scalar(int16, shape=())>
# └─ 3 [id F] <Scalar(int8, shape=())>
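For reference, the graph I would like the rewriter to reach applies exp before broadcasting. The snippet below is written by hand as an illustration of that target; it is not the output of any current rewrite:

import pytensor.tensor as pt

x = pt.vector("x", shape=(3,))
# Apply the Elemwise to the (3,)-shaped input first, then broadcast/expand_dims,
# so exp is evaluated 3 times instead of 1000 * 3 times
lifted = pt.alloc(pt.exp(x), 1000, 3)[None]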
There is actually an "uncanonicalize" rewrite that allows "lifting" expand_dims above some Allocs, which would have helped here:
https://github.com/pymc-devs/pytensor/blob/e6e6d69f6d878786270f1751098b0682e2d8f607/pytensor/tensor/rewriting/uncanonicalize.py#L125-L150
However, this is at odds with the canonical local_alloc_sink_dimshuffle rewrite, which goes in the opposite direction:
https://github.com/pymc-devs/pytensor/blob/d62f4b19d412d91994dd12362f7976d690911084/pytensor/tensor/rewriting/basic.py#L462-L467
It's not obvious to me why the latter should be given preference. In general it seems like we can always lift expand_dims towards the inputs of the function (since it does not affect the number of operations) and sink Alloc towards the outputs. But here we are not allowing the "swap" when an expand_dims meets an Alloc.
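To make the argument concrete, here is the chain of equivalent graphs such a swap would move through, checked numerically. These forms are hand-written for illustration only; I'm not claiming the intermediate steps match what any specific existing rewrite produces:

import numpy as np
import pytensor
import pytensor.tensor as pt

x = pt.vector("x", shape=(3,))
forms = [
    pt.exp(pt.alloc(x, 1, 1000, 3)),      # canonical form seen in the dprint above
    pt.exp(pt.alloc(x, 1000, 3)[None]),   # expand_dims lifted out of the Alloc
    pt.exp(pt.alloc(x, 1000, 3))[None],   # expand_dims lifted above the Elemwise
    pt.alloc(pt.exp(x), 1000, 3)[None],   # Alloc pushed below the Elemwise
]
results = [pytensor.function([x], f)([0.0, 1.0, 2.0]) for f in forms]
for res in results[1:]:
    np.testing.assert_allclose(results[0], res)  # all four graphs compute the same thing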