Mario Lezcano Casado
Let's just wait for @bdhirsh to be back then, as this issue is not blocking anything.
@vfdev-5 please run the torchbench suite on the HUD.
If CPU benchmarks are not run on the hud, then it may be better for them to be run by someone from intel so that they are run on the...
Actually, I don't think we have a nice way to perform graph transformations at an IR level... We could do something like what we do for the int64 -> int32...
We are implementing something that's a bit less generic than a masked load, but that works for all these ops in https://github.com/pytorch/pytorch/pull/116491
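For readers unfamiliar with the term, here is a minimal plain-Python sketch of what a generic masked load does, alongside one restricted variant. Both function names are hypothetical, and the bounds-derived mask is only an assumed example of a "less generic" form; it is not taken from the linked PR and may not match what the PR implements.

```python
def masked_load(buffer, indices, mask, other=0):
    """Generic masked load: read buffer[i] where the mask is true, else `other`."""
    out = []
    for i, m in zip(indices, mask):
        # The index is only dereferenced when the mask is true, so
        # out-of-bounds indices are safe as long as they are masked out.
        out.append(buffer[i] if m else other)
    return out


def bounds_masked_load(buffer, indices, other=0):
    """Assumed restricted variant: the mask is implied by 0 <= i < len(buffer)."""
    mask = [0 <= i < len(buffer) for i in indices]
    return masked_load(buffer, indices, mask, other)


buf = [10, 20, 30]
# -1 and 3 are out of bounds, so they produce the fill value instead of a read.
result = bounds_masked_load(buf, [0, -1, 2, 3])
print(result)
```

The generic form accepts an arbitrary mask, while the restricted form only covers callers whose mask is exactly the bounds check, which is why each use site would need to be checked before switching.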
@isuruf what happened with this PR?
This op is just to be used by inductor and other compilers. We would make an inductor-only prim, but we want to support Autograd as well, that's why we're putting...
Sorry, didn't mean to approve yet.
You'll need to see if all the uses of the masked load in the lowerings that we want to implement with this op have a mask that's exactly equal to...
> This sounds a bit problematic to me. What if the decomposition uses indices in the range [0, size - 1) and -1 should be masked out? My question is, is...