Both optimized and portable arithmetic operators have broadcasting errors
🐛 Describe the bug
Subtraction behaves incorrectly when both x and y have to be broadcast.
Repro code snippet:
import torch
from torch.export import Dim
from executorch.exir import to_edge
from executorch.runtime import Runtime
class Model(torch.nn.Module):
    def forward(self, x, y):
        return x - y
dim = Dim("dim", min=1, max=1024)
ep = torch.export.export(
    Model(), (torch.ones(1, 4), torch.ones(4, 1)), dynamic_shapes=({1: dim}, {0: dim})
)
pte = to_edge(ep).to_executorch().buffer
runtime = Runtime.get()
program = runtime.load_program(pte)
method = program.load_method("forward")
# run sub with (1, 2) and (2, 1) first, result should be zeros(2, 2)
method.execute((torch.ones(1, 2), torch.ones(2, 1)))
# run sub with (1, 4) and (4, 1), result should be zeros(4, 4) but actually getting zeros(1, 4)
method.execute((torch.ones(1, 4), torch.ones(4, 1)))
"""
Actual
[tensor([[0., 0., 0., 0.]])]
Expected
[tensor([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])]
"""
Versions
Latest
cc @manuelcandales
cc @swolchok
hm. portable sub should be working fine:
https://github.com/pytorch/executorch/blame/main/kernels/portable/cpu/op_sub.cpp#L47
(@larryliu0820 are you sure you're running portable?)
optimized just plain doesn't seem to have a resize except in one specific case, which seems obviously broken:
https://github.com/pytorch/executorch/blame/main/kernels/optimized/cpu/op_sub.cpp#L108
I think I need to learn why https://github.com/pytorch/executorch/blob/main/kernels/test/op_sub_test.cpp#L266 doesn't cover this case.
Are you sure? I do see this https://github.com/pytorch/executorch/blob/3072890824adfb95b02159746a130751d184c950/kernels/optimized/cpu/op_add_sub_impl.h#L90
> Are you sure? I do see this
OK, two specific cases. That resize is also gated; there are clearly uncovered cases.
findings:
- the specific regression test I attempted to add for op_sub passes for both portable and optimized.
- I was wrong when I said "there are clearly uncovered cases" above. I wasn't reading closely; the uncovered cases funnel into utility functions that do the necessary resize.
My best guess is that something is wrong in the Python or lowering layer in this report; the kernels themselves seem to be correct.
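One way to probe that guess from Python (a sketch, assuming the `ep` from the repro above and that the EdgeProgramManager returned by to_edge exposes exported_program()): inspect the FakeTensor metadata on the lowered graph and see whether the output shape kept its symbolic dim or got specialized to a static (1, 4):

from executorch.exir import to_edge

edge = to_edge(ep)
for node in edge.exported_program().graph.nodes:
    val = node.meta.get("val")
    if hasattr(val, "shape"):
        # symbolic dims print as s0, s1, ...; a baked-in static
        # (1, 4) on the output would point at the lowering layer
        print(node.op, node.target, val.shape)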
@larryliu0820 your repro script doesn't run for me. perhaps you are using a different torch version than the pinned one?
(.venv) swolchok@swolchok-mac ~/src/executorch> python repro.py
Traceback (most recent call last):
File "/Users/swolchok/src/executorch/repro.py", line 11, in <module>
dim = Dim.AUTO(min=1, max=1024)
TypeError: '_DimHint' object is not callable
over to you to fix the repro
@swolchok what is your full error? Is it complaining that Dim.AUTO doesn't exist?
@larryliu0820 sorry I missed a line while pasting. edited.
@swolchok try
dim = Dim("dim", min=1, max=1024)
instead
repros. interestingly, if I comment out the first method.execute call, I get the correct result from the second call.
This is not an operator-specific bug. Also repros with mul/add/div, but not torch.max. I'm probably not the best person to look into this.
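For reference, a sketch of that op sweep, reusing the repro's export and runtime steps (the per-op module M and the loop are illustrative, not from the report):

import operator

import torch
from torch.export import Dim
from executorch.exir import to_edge
from executorch.runtime import Runtime

for op in (operator.sub, operator.add, operator.mul, operator.truediv, torch.max):
    class M(torch.nn.Module):
        def forward(self, x, y):
            return op(x, y)

    dim = Dim("dim", min=1, max=1024)
    ep = torch.export.export(
        M(), (torch.ones(1, 4), torch.ones(4, 1)), dynamic_shapes=({1: dim}, {0: dim})
    )
    pte = to_edge(ep).to_executorch().buffer
    program = Runtime.get().load_program(pte)
    method = program.load_method("forward")
    method.execute((torch.ones(1, 2), torch.ones(2, 1)))  # prime with a smaller shape
    out = method.execute((torch.ones(1, 4), torch.ones(4, 1)))[0]
    # affected ops report (1, 4) here instead of the correct (4, 4)
    print(op.__name__, tuple(out.shape))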
> repros. interestingly, if I comment out the first method.execute call, I get the correct result from the second call.
Yeah, I'm guessing something in Program or Method is stateful. It's unlikely to be inside a kernel, since kernels can't keep any state.
@JacobSzwejbka do you know anything in Program/Method that could cause this?
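A diagnostic sketch for that theory, assuming the pte buffer from the repro above: give every call a freshly loaded Method. If each fresh method returns the correct shape, the stale state lives in Program/Method rather than in the kernels:

runtime = Runtime.get()
for inputs in (
    (torch.ones(1, 2), torch.ones(2, 1)),
    (torch.ones(1, 4), torch.ones(4, 1)),
):
    program = runtime.load_program(pte)      # fresh Program per call
    method = program.load_method("forward")  # fresh Method per call
    print(method.execute(inputs)[0].shape)   # expect (2, 2), then (4, 4)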