[Bug][Frontend][PyTorch] Bug in the aten::fill_() PyTorch operator implementation?
There is potentially an issue with the implementation of the PyTorch aten::fill_() operator in the PyTorch frontend.
I found this bug while trying to run the gallery/how_to/deploy_models/deploy_object_detection_pytorch.py example with Torch 1.12.0 (see my post on the Discuss forum: [frontend][pytorch] TVM compatibility with Torch 1.12.0). This example uses a Mask R-CNN model. After some debugging, I found that execution was failing when parsing these two lines:
%2004 : Long(requires_grad=0, device=cpu) = aten::div(%1999, %other.1, %43), scope: __module.model/__module.model.rpn/__module.model.rpn.anchor_generator # /opt/homebrew/Caskroom/miniforge/base/envs/tvm_test/lib/python3.8/site-packages/torch/_tensor.py:674:0
%stride_height.1 : Long(requires_grad=0, device=cpu) = aten::fill_(%2003, %2004), scope: __module.model/__module.model.rpn/__module.model.rpn.anchor_generator # /opt/homebrew/Caskroom/miniforge/base/envs/tvm_test/lib/python3.8/site-packages/torchvision/models/detection/anchor_utils.py:121:0
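The same graph pattern can be reproduced in isolation. The sketch below is my own reconstruction of the pattern (not the torchvision source): an aten::div produces a 0-dim Long tensor that is then consumed by aten::fill_.

import torch

# My reconstruction of the failing pattern (not the torchvision source):
# a division yields a 0-dim Long tensor, which is then fed to fill_().
def pattern(x):
    stride = torch.div(torch.tensor(20), torch.tensor(2), rounding_mode="floor")
    return torch.empty_like(x).fill_(stride)

traced = torch.jit.trace(pattern, torch.rand(1, 3, 10, 10))
print(traced.graph)  # shows aten::div feeding aten::fill_, as in the lines above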
This is why I have prepared a simple unit-test example that highlights this problem. The problem is also present with Torch 1.11.0.
Expected behavior
The unit tests should run successfully. The PyTorch aten::fill_() operator accepts a scalar or a 0-dim value tensor in all three tests, and the PyTorch and TVM outputs of this operator have to be identical.
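For reference, eager PyTorch accepts a 0-dim tensor as the fill value just like a plain scalar, which is the behavior the TVM frontend should match. A minimal check:

import torch

# In eager PyTorch, fill_() treats a 0-dim tensor value exactly like a scalar.
x = torch.rand(1, 3, 10, 10)
y = torch.div(torch.tensor(6.0), torch.tensor(2.0))  # 0-dim tensor(3.)
assert torch.equal(x.clone().fill_(y), x.clone().fill_(3.0))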
Actual behavior
The first two tests (listed in "Steps to reproduce") work fine. However, the third test fails with an error:
=================================== FAILURES ===================================
______________________________ test_fill_with_div ______________________________
tests/python/frontend/pytorch/test_forward.py:276: in test_fill_with_div
verify_model_with_input(test_func, [torch.rand([1, 3, 10, 10]).float()])
tests/python/frontend/pytorch/test_forward.py:242: in verify_model_with_input
mod, params = relay.frontend.from_pytorch(trace, input_shapes, custom_convert_map)
python/tvm/relay/frontend/pytorch.py:4564: in from_pytorch
outputs = converter.convert_operators(_get_operator_nodes(graph.nodes()), outputs, ret_name)
python/tvm/relay/frontend/pytorch.py:3938: in convert_operators
relay_out = relay_op(
python/tvm/relay/frontend/pytorch.py:812: in fill_
return self.full_impl(self.infer_shape(data), fill_value, input_types[0])
python/tvm/relay/frontend/pytorch.py:679: in full_impl
out = _op.full(_expr.const(fill_value, dtype=dtype), size, dtype=dtype)
python/tvm/relay/expr.py:517: in const
raise ValueError("value has to be scalar or NDArray")
E ValueError: value has to be scalar or NDArray
----------------------------- Captured stderr call -----------------------------
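The error comes from relay.const, which only accepts Python scalars and NDArrays. The fill value here is the converted output of aten::div, i.e. a Relay expression, which full_impl passes straight to relay.const. A standalone sketch of the same failure (assuming a recent TVM build):

from tvm import relay

relay.const(3.0)  # fine: a Python scalar

# The converted aten::div output is a Relay expression, not a scalar or an
# NDArray, so relay.const raises the same ValueError as in the traceback.
div_expr = relay.divide(relay.const(6.0), relay.const(2.0))
relay.const(div_expr)  # ValueError: value has to be scalar or NDArray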
Environment
TVM version: v0.10.dev (c83ee08c102f3)
Architecture: arm64
System Version: macOS 12.5.1 (21G83)
Kernel Version: Darwin 21.6.0
Steps to reproduce
Add the tests listed below to the tests/python/frontend/pytorch/test_forward.py script and run it (for example, pytest tests/python/frontend/pytorch/test_forward.py -k fill).
# works fine
def test_fill():
    def test_func(x):
        return x.fill_(3)

    verify_model_with_input(test_func, [torch.rand([1, 3, 10, 10]).float()])

# works fine
def test_fill_zero_dim_value_tensor():
    def test_func(x):
        return x.fill_(torch.tensor(3))

    verify_model_with_input(test_func, [torch.rand([1, 3, 10, 10]).float()])

# FAILURE
def test_fill_with_div():
    def test_func(x):
        y = torch.div(torch.tensor(6.), torch.tensor(2.))
        return x.fill_(y)

    verify_model_with_input(test_func, [torch.rand([1, 3, 10, 10]).float()])
We just need to add constant folding on the fill value. See this commit https://github.com/masahi/tvm/commit/3e88280ec3a0b943d8aac76a7a99f75ffd0ac863.
Can you make a PR with your test case?
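For illustration, a rough sketch of what folding the fill value could look like in the fill_ converter in python/tvm/relay/frontend/pytorch.py (hypothetical code, not the linked commit; infer_value is the frontend's existing constant-evaluation helper):

from tvm.relay.frontend.common import infer_value as _infer_value

# Hypothetical sketch (not the linked commit), as a method of the converter
# class: evaluate a non-scalar fill value to a concrete Python scalar before
# full_impl hands it to relay.const.
def fill_(self, inputs, input_types):
    data = inputs[0]
    fill_value = inputs[1]
    if not isinstance(fill_value, (bool, int, float)):
        # div(6., 2.) is a constant subgraph: run it to an NDArray, then
        # extract the scalar that relay.const can accept.
        fill_value = _infer_value(fill_value, {}).numpy().item()
    return self.full_impl(self.infer_shape(data), fill_value, input_types[0])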
Thank you a lot, @masahi. I have created a PR with my test case: https://github.com/apache/tvm/pull/12857