Failed to convert DETR model from pytorch to onnx on cpu device

Open windmakeppcool opened this issue 3 years ago • 2 comments

I used the command below to try to convert a DETR model from PyTorch to ONNX:

python ./tools/deploy.py \
    config/detection_onnxruntime_static.py \
    config/detr_r50_8x2_150e_coco.py \
    ckpts/detr_r50_8x2_150e_coco_20201130_194835-2c4b8974.pth \
    demo/demo.jpg \
    --work-dir work_dirs \
    --device cpu

but it fails with the error shown below:

[2022-09-06 16:52:51.147] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
[2022-09-06 16:52:52.504] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
/home/liangly/open-mmlab/mmdetection/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  warnings.warn(
[2022-09-06 16:52:54.835] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
2022-09-06 16:52:54,839 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
load checkpoint from local path: /home/liangly/open-mmlab/depoly/ckpts/detr_r50_8x2_150e_coco_20201130_194835-2c4b8974.pth
/home/liangly/open-mmlab/mmdetection/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  warnings.warn(
2022-09-06 16:52:56,204 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
2022-09-06 16:52:56,205 - mmdeploy - INFO - Export PyTorch model to ONNX: /home/liangly/open-mmlab/depoly/work_dirs/end2end.onnx.
/home/liangly/open-mmlab/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/home/liangly/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/base.py:24: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  img_shape = [int(val) for val in img_shape]
/home/liangly/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/models/detectors/base.py:24: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  img_shape = [int(val) for val in img_shape]
/home/liangly/open-mmlab/mmdetection/mmdet/models/utils/positional_encoding.py:81: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = self.temperature**(2 * (dim_t // 2) / self.num_feats)
/home/liangly/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:28: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  k = torch.tensor(k, device=input.device, dtype=torch.long)
/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py:325: UserWarning: Type cannot be inferred, which might cause exported graph to produce incorrect results.
  warnings.warn("Type cannot be inferred, which might cause exported graph to produce incorrect results.")
/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/symbolic_opset9.py:2815: UserWarning: Exporting aten::index operator of advanced indexing in opset 13 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  warnings.warn("Exporting aten::index operator of advanced indexing in opset " +
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[950, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[950, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[950, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[950, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[950, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: shape '[100, 1, 1, 950]' is invalid for input of size 950 (function ComputeConstantFolding)
Process Process-2:
Traceback (most recent call last):
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/pytorch2onnx.py", line 96, in torch2onnx
    export(
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/onnx/export.py", line 122, in export
    torch.onnx.export(
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/__init__.py", line 316, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/utils.py", line 107, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/utils.py", line 724, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/core/rewriters/rewriter_utils.py", line 379, in wrapper
    return self.func(self, *args, **kwargs)
  File "/home/liangly/open-mmlab/mmdeploy/mmdeploy/apis/onnx/optimizer.py", line 10, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "/home/liangly/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/onnx/utils.py", line 544, in _model_to_graph
    params_dict = torch._C._jit_pass_onnx_constant_fold(graph, params_dict,
RuntimeError: shape '[950, 1, 1, 950]' is invalid for input of size 950
2022-09-06 16:53:09,080 - mmdeploy - ERROR - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

What is the cause of this error? How can I convert the DETR model correctly?

Sep 06 '22 09:09 windmakeppcool
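
A note on the error message itself: a reshape can only succeed when the target shape holds exactly as many elements as the input. The sketch below only checks that arithmetic for the numbers in the RuntimeError. `can_reshape` is a hypothetical helper written for this illustration, not part of PyTorch or MMDeploy, and it does not explain why the exporter produced that shape.

```python
from math import prod
from typing import Sequence


def can_reshape(input_size: int, target_shape: Sequence[int]) -> bool:
    """Hypothetical helper: True if `input_size` elements fit `target_shape` exactly."""
    return prod(target_shape) == input_size


# Numbers copied from the RuntimeError in the log above.
input_size = 950
target_shape = (950, 1, 1, 950)

print(prod(target_shape))                     # 902500 elements required by the target shape
print(can_reshape(input_size, target_shape))  # False, since only 950 elements are available
```

The same mismatch applies to the '[100, 1, 1, 950]' warnings, where 95000 elements would be needed and 950 are available.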

My environment is listed below:

2022-09-06 17:13:12,242 - mmdeploy - INFO - 

2022-09-06 17:13:12,242 - mmdeploy - INFO - **********Environmental information**********
2022-09-06 17:13:12,446 - mmdeploy - INFO - sys.platform: linux
2022-09-06 17:13:12,447 - mmdeploy - INFO - Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
2022-09-06 17:13:12,447 - mmdeploy - INFO - CUDA available: True
2022-09-06 17:13:12,447 - mmdeploy - INFO - GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
2022-09-06 17:13:12,447 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2022-09-06 17:13:12,447 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.1, V11.1.74
2022-09-06 17:13:12,447 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
2022-09-06 17:13:12,447 - mmdeploy - INFO - PyTorch: 1.10.1+cu111
2022-09-06 17:13:12,447 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2022-09-06 17:13:12,447 - mmdeploy - INFO - TorchVision: 0.11.2+cu111
2022-09-06 17:13:12,447 - mmdeploy - INFO - OpenCV: 4.6.0
2022-09-06 17:13:12,447 - mmdeploy - INFO - MMCV: 1.6.0
2022-09-06 17:13:12,447 - mmdeploy - INFO - MMCV Compiler: GCC 9.4
2022-09-06 17:13:12,447 - mmdeploy - INFO - MMCV CUDA Compiler: 11.1
2022-09-06 17:13:12,447 - mmdeploy - INFO - MMDeploy: 0.7.0+b602356
2022-09-06 17:13:12,447 - mmdeploy - INFO - 

2022-09-06 17:13:12,447 - mmdeploy - INFO - **********Backend information**********
2022-09-06 17:13:12,781 - mmdeploy - INFO - onnxruntime: 1.8.1  ops_is_avaliable : False
2022-09-06 17:13:12,783 - mmdeploy - INFO - tensorrt: None      ops_is_avaliable : False
2022-09-06 17:13:12,798 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
2022-09-06 17:13:12,800 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-09-06 17:13:12,802 - mmdeploy - INFO - openvino_is_avaliable: False
2022-09-06 17:13:12,816 - mmdeploy - INFO - snpe_is_available: False
2022-09-06 17:13:12,818 - mmdeploy - INFO - ascend_is_available: False
2022-09-06 17:13:12,820 - mmdeploy - INFO - coreml_is_available: False
2022-09-06 17:13:12,820 - mmdeploy - INFO - 

2022-09-06 17:13:12,820 - mmdeploy - INFO - **********Codebase information**********
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmdet:      2.25.1
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmseg:      0.26.0
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmcls:      0.23.1
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmocr:      None
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmedit:     None
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmdet3d:    1.0.0rc3
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmpose:     None
2022-09-06 17:13:12,823 - mmdeploy - INFO - mmrotate:   None

Sep 06 '22 09:09 windmakeppcool

I see. We will fix it ASAP. In the meantime, detection_onnxruntime_dynamic.py works, so you can use that config to convert your model.

Sep 06 '22 11:09 grimoire
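
For anyone applying the suggested workaround, here is a minimal sketch of the rerun. It reuses the arguments from the original command and swaps only the deploy config. The path `config/detection_onnxruntime_dynamic.py` is an assumption (the reply names only the file, so this assumes it sits next to the static config in the same local directory), and the device option is written as `--device`.

```python
import shlex

# ASSUMPTION: the dynamic config lives in the same local config/ directory
# that held the static config in the original command.
deploy_cfg = "config/detection_onnxruntime_dynamic.py"

# All other arguments are copied from the original post.
cmd = [
    "python", "./tools/deploy.py",
    deploy_cfg,
    "config/detr_r50_8x2_150e_coco.py",
    "ckpts/detr_r50_8x2_150e_coco_20201130_194835-2c4b8974.pth",
    "demo/demo.jpg",
    "--work-dir", "work_dirs",
    "--device", "cpu",
]

# Print the command so it can be pasted into a shell.
print(shlex.join(cmd))
```

To run it directly instead of printing, `cmd` can be passed to `subprocess.run(cmd, check=True)` from the same working directory as the original command.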