
Avoid explicitly broadcasting RandomVariable inputs

ricardoV94 opened this issue 4 months ago

Description

RVs broadcast their batch inputs automatically, yet we still materialize explicit broadcasts of those inputs (a broadcast which in PyTensor is always dense), similar to how we fail to avoid it in #1561:

import pytensor
import pytensor.tensor as pt

mu = pt.vector("x")
sigma = pt.vector("sigma")
mu_b, sigma_b = pt.broadcast_arrays(mu[:, None], sigma[None, :])
out = pt.random.normal(mu_b, sigma_b)

fn = pytensor.function([mu, sigma], [out])
fn.dprint()
# normal_rv{"(),()->()"}.1 [id A] 4
#  ├─ RNG(<Generator(PCG64) at 0x7F9655BAC120>) [id B]
#  ├─ NoneConst{None} [id C]
#  ├─ Second [id D] 3
#  │  ├─ ExpandDims{axis=0} [id E] 0
#  │  │  └─ sigma [id F]
#  │  └─ ExpandDims{axis=1} [id G] 1
#  │     └─ x [id H]
#  └─ Second [id I] 2
#     ├─ ExpandDims{axis=1} [id G] 1
#     │  └─ ···
#     └─ ExpandDims{axis=0} [id E] 0
#        └─ ···

The Second operation will iterate over every entry of sigma to copy mu, and over every entry of mu to copy sigma. We should rewrite those as Alloc, but that rewrite is currently not implemented: https://github.com/pymc-devs/pytensor/blob/6770f46ed575f8f2d5367146d678d00c1c0b1c0b/pytensor/tensor/rewriting/basic.py#L402-L405
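For intuition, the cost difference can be sketched at the NumPy level (an analogy, not PyTensor code): an elementwise "second"-style copy materializes a full dense array, while a broadcast view like `np.broadcast_to` costs no memory at all, since it just sets a stride of 0 along the broadcast axis.

```python
import numpy as np

mu = np.arange(3.0)      # shape (3,)
sigma = np.arange(4.0)   # shape (4,)

# Elementwise copy: allocates a full dense (3, 4) array (what Second does)
mu_b = np.empty((3, 4))
mu_b[...] = mu[:, None]

# Broadcast view: zero-copy, stride 0 along the new axis (what an
# Alloc-free graph could exploit)
sigma_b = np.broadcast_to(sigma[None, :], (3, 4))

assert mu_b.shape == sigma_b.shape == (3, 4)
assert sigma_b.strides[0] == 0  # no memory was allocated for axis 0
```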

More importantly, we shouldn't do this at all for RVs, since they broadcast their parameters automatically: the graph above is equivalent to `pt.random.normal(mu[:, None], sigma[None, :])`.
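The same equivalence holds for NumPy's own samplers, which is a quick way to see why the explicit broadcast buys nothing (an illustrative analogy, not PyTensor code):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.arange(3.0)   # shape (3,)
sigma = np.ones(4)    # shape (4,)

# Explicitly broadcasting the parameters before sampling...
mu_b, sigma_b = np.broadcast_arrays(mu[:, None], sigma[None, :])
explicit = rng.normal(mu_b, sigma_b)

# ...yields the same output shape as letting the sampler broadcast implicitly
implicit = rng.normal(mu[:, None], sigma[None, :])

assert explicit.shape == implicit.shape == (3, 4)
```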

This redundant broadcast also shows up when there's a size argument:

import pytensor
import pytensor.tensor as pt

mu = pt.vector("x")
out = pt.random.exponential(pt.broadcast_to(mu, (5, 3)), size=(5, 3))

fn = pytensor.function([mu], [out])
fn.dprint()
# exponential_rv{"()->()"}.1 [id A] 1
#  ├─ RNG(<Generator(PCG64) at 0x7F9655BAF220>) [id B]
#  ├─ [5 3] [id C]
#  └─ Alloc [id D] 0
#     ├─ x [id E]
#     ├─ 5 [id F]
#     └─ 3 [id G]

An Alloc on batch dimensions is never necessary when size is given.
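Again the NumPy analogue makes the redundancy plain (illustrative, not PyTensor code): given a `size`, the sampler broadcasts the parameter to it on its own, so pre-broadcasting the input is wasted work.

```python
import numpy as np

rng = np.random.default_rng(0)
scale = np.arange(1.0, 4.0)  # shape (3,)

# No need for np.broadcast_to(scale, (5, 3)) first:
# size=(5, 3) already broadcasts the (3,) parameter over the batch axis
draws = rng.exponential(scale, size=(5, 3))

assert draws.shape == (5, 3)
```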

ricardoV94, Jul 29 '25 11:07