mmdeploy
[Bug] DeformConv2dFunction is not exportable to ONNX IR when padding is int
Checklist
- [X] I have searched related issues but cannot get the expected help.
- [X] I have read the FAQ documentation but cannot get the expected help.
- [X] The bug has not been fixed in the latest version.
Describe the bug
The type of `padding` of the `DeformConv2dFunction` is defined to be `Union[int, Tuple[int, ...]]`:
https://github.com/open-mmlab/mmcv/blob/d9e10e11846d911e8354cd024967d3a17a88083c/mmcv/ops/deform_conv.py#L77
But the symbolic rewriter expects `padding` to be a pair (it works only with the `Tuple` form):
https://github.com/open-mmlab/mmdeploy/blob/bc75c9d6c8940aa03d0e1e5b5962bd930478ba77/mmdeploy/mmcv/ops/deform_conv.py#L25
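The pair expectation can be checked in isolation. The sketch below copies only the list comprehension quoted in the traceback; the wrapper function `interleave_padding` and the alias `PaddingArg` are hypothetical names used for illustration and are not part of mmdeploy or mmcv.

```python
from typing import List, Tuple, Union

# Mirrors the annotation quoted from mmcv; the alias name is illustrative.
PaddingArg = Union[int, Tuple[int, ...]]


def interleave_padding(padding: PaddingArg) -> List[int]:
    """Hypothetical wrapper around the expression from the traceback.

    zip() needs an iterable, so a tuple works and a bare int does not.
    """
    return [p for pair in zip(padding, padding) for p in pair]


# A tuple is iterable, so each value is duplicated in order.
print(interleave_padding((1, 2)))

# A bare int is accepted by the type annotation but cannot be zipped.
try:
    interleave_padding(1)
except TypeError as exc:
    print(f"int padding fails: {exc}")
```

The `int` case is valid according to the annotation, yet it cannot pass through this expression, which matches the failure reported here.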
As a result, when trying to export the TOOD model, I get the following exception:
...
qi-inference-module | File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1708, in _run_symbolic_method
qi-inference-module | return symbolic_fn(graph_context, *args)
qi-inference-module | File "/venv/lib/python3.10/site-packages/mmdeploy/mmcv/ops/deform_conv.py", line 25, in deform_conv__default
qi-inference-module | padding_i=[p for pair in zip(padding, padding) for p in pair],
qi-inference-module | TypeError: 'int' object is not iterable (occurred when translating DeformConv2dFunction)
qi-inference-module | 02/15 06:58:11 - mmengine - ERROR - /venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
This happens because `deform_conv2d` is called with an `int` `padding` argument:
https://github.com/open-mmlab/mmdetection/blob/cfd5d3a985b0249de009b67d04f37263e11cdf3d/mmdet/models/dense_heads/tood_head.py#L313
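One possible direction for a fix, sketched under assumptions, is to normalize `padding` to a pair before building the attribute list. The helpers `to_pair` and `padding_attr` below are hypothetical and are not the actual mmdeploy code; they behave similarly to PyTorch's `torch.nn.modules.utils._pair` for the `int` case, but are written in plain Python so the snippet runs without extra dependencies.

```python
from typing import List, Sequence, Tuple, Union


def to_pair(value: Union[int, Sequence[int]]) -> Tuple[int, int]:
    """Hypothetical helper: expand an int to (v, v), pass a 2-sequence through."""
    if isinstance(value, int):
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2:
        raise ValueError(f"expected an int or a pair, got {value!r}")
    return pair


def padding_attr(padding: Union[int, Sequence[int]]) -> List[int]:
    """Build the interleaved padding list from either an int or a pair."""
    pad = to_pair(padding)
    # Same expression as in the traceback, applied to the normalized pair.
    return [p for pair in zip(pad, pad) for p in pair]


print(padding_attr(1))       # int input no longer raises
print(padding_attr((1, 2)))  # tuple input behaves as before
```

Whether the same normalization is also needed for `stride` and `dilation` in the rewriter is not established by this report and would need to be checked separately.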
Reproduction
We integrate mmdeploy and mmdetection in our proprietary codebase, so I cannot post any code here.
We just call `torch2onnx` from `mmdeploy.apis`.
Environment
The environment is proprietary.
Error traceback
02/15 06:58:03 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
02/15 06:58:03 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "mmdet_tasks" registry tree. As a workaround, the current "mmdet_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: /tmp/tmp8ong2sd5/mmdet_checkpoint.pth
02/15 06:58:05 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
02/15 06:58:05 - mmengine - INFO - Export PyTorch model to ONNX: /app/models/tood/end2end.onnx.
02/15 06:58:05 - mmengine - WARNING - Can not find torch.nn.functional._scaled_dot_product_attention, function rewrite will not be applied
02/15 06:58:05 - mmengine - WARNING - Can not find mmdet.models.utils.transformer.PatchMerging.forward, function rewrite will not be applied
/venv/lib/python3.10/site-packages/mmdeploy/codebase/mmdet/models/detectors/single_stage.py:80: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
img_shape = [int(val) for val in img_shape]
/venv/lib/python3.10/site-packages/mmdeploy/codebase/mmdet/models/detectors/single_stage.py:80: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
img_shape = [int(val) for val in img_shape]
/venv/lib/python3.10/site-packages/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
ys_shape = tuple(int(s) for s in ys.shape)
/venv/lib/python3.10/site-packages/mmcv/ops/deform_conv.py:218: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if not all(map(lambda s: s > 0, output_size)):
/venv/lib/python3.10/site-packages/mmcv/ops/deform_conv.py:114: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
int(i)
/venv/lib/python3.10/site-packages/mmcv/ops/deform_conv.py:120: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
cur_im2col_step = min(ctx.im2col_step, input.size(0))
/venv/lib/python3.10/site-packages/mmcv/ops/deform_conv.py:121: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert (input.size(0) % cur_im2col_step
/venv/lib/python3.10/site-packages/mmdeploy/codebase/mmdet/models/dense_heads/base_dense_head.py:109: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert cls_score.size()[-2:] == bbox_pred.size()[-2:]
/venv/lib/python3.10/site-packages/mmdeploy/pytorch/functions/topk.py:58: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if k > size:
/venv/lib/python3.10/site-packages/mmdeploy/codebase/mmdet/models/task_modules/coders/delta_xywh_bbox_coder.py:38: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert pred_bboxes.size(0) == bboxes.size(0)
/venv/lib/python3.10/site-packages/mmdeploy/codebase/mmdet/models/task_modules/coders/delta_xywh_bbox_coder.py:40: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert pred_bboxes.size(1) == bboxes.size(1)
/venv/lib/python3.10/site-packages/mmdeploy/mmcv/ops/nms.py:474: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
int(scores.shape[-1]),
/venv/lib/python3.10/site-packages/mmdeploy/mmcv/ops/nms.py:148: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
out_boxes = min(num_boxes, after_topk)
[W shape_type_inference.cpp:1920] Warning: The shape inference of mmdeploy::GridPriorsTRT type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 1 WARNING 0 ERROR ========================
1 WARNING were not printed due to the log level.
Process Process-1:2:
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
ret = func(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/pytorch2onnx.py", line 98, in torch2onnx
export(
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
return self.call_function(func_name_, *args, **kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
return self.call_function_local(func_name, *args, **kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
return pipe_caller(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
ret = func(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/onnx/export.py", line 138, in export
torch.onnx.export(
File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 506, in export
_export(
File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1548, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/venv/lib/python3.10/site-packages/mmdeploy/apis/onnx/optimizer.py", line 27, in model_to_graph__custom_optimizer
graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1117, in _model_to_graph
graph = _optimize_graph(
File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 665, in _optimize_graph
graph = _C._jit_pass_onnx(graph, operator_export_type)
File "/venv/lib/python3.10/site-packages/torch/onnx/utils.py", line 1708, in _run_symbolic_method
return symbolic_fn(graph_context, *args)
File "/venv/lib/python3.10/site-packages/mmdeploy/mmcv/ops/deform_conv.py", line 25, in deform_conv__default
padding_i=[p for pair in zip(padding, padding) for p in pair],
TypeError: 'int' object is not iterable (occurred when translating DeformConv2dFunction)
02/15 06:58:11 - mmengine - ERROR - /venv/lib/python3.10/site-packages/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.