[Bug] RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.

Open AlfaRomeo9527 opened this issue 2 years ago • 4 comments

Checklist

  • [x] 1. I have searched related issues but cannot get the expected help.
  • [x] 2. I have read the FAQ documentation but cannot get the expected help.
  • [ ] 3. The bug has not been fixed in the latest version.

Describe the bug

When I convert a .pth checkpoint to a TensorRT engine, the ONNX export step fails. I used this command: python3 tools/deploy.py F:\workspace\mmdeploy\configs\mmseg\segmentation_tensorrt_static-512x512.py F:\workspace\mmsegmentation\configs\segmenter\segmenter_vit-s_mask_8x1_512x512_160k_ade20k.py F:\workspace\mmsegmentation\checkpoints\segmenter_vit-b_mask_8x1_512x512_160k_cancerroi\segmenter_vit-s_mask_8x1_512x512_160k_ade20k_20220105_151706-511bb103.pth 512.jpg --work-dir mmdeploy_model/Segmentor_ade20k --device cuda --dump-info

RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.

Reproduction

python3 tools/deploy.py F:\workspace\mmdeploy\configs\mmseg\segmentation_tensorrt_static-512x512.py F:\workspace\mmsegmentation\configs\segmenter\segmenter_vit-s_mask_8x1_512x512_160k_ade20k.py F:\workspace\mmsegmentation\checkpoints\segmenter_vit-b_mask_8x1_512x512_160k_cancerroi\segmenter_vit-s_mask_8x1_512x512_160k_ade20k_20220105_151706-511bb103.pth 512.jpg --work-dir mmdeploy_model/Segmentor_ade20k --device cuda --dump-info

Environment

2022-10-21 17:36:32,179 - mmdeploy - INFO - sys.platform: win32
2022-10-21 17:36:32,179 - mmdeploy - INFO - Python: 3.8.13 (default, Mar 28 2022, 06:59:08) [MSC v.1916 64 bit (AMD64)]
2022-10-21 17:36:32,179 - mmdeploy - INFO - CUDA available: True
2022-10-21 17:36:32,179 - mmdeploy - INFO - GPU 0: NVIDIA GeForce RTX 3090
2022-10-21 17:36:32,179 - mmdeploy - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
2022-10-21 17:36:32,179 - mmdeploy - INFO - NVCC: Not Available
2022-10-21 17:36:32,179 - mmdeploy - INFO - GCC: n/a
2022-10-21 17:36:32,179 - mmdeploy - INFO - PyTorch: 1.9.1+cu111
2022-10-21 17:36:32,180 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 2019
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.0.5
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=C:/w/b/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/w/b/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, 

2022-10-21 17:36:32,180 - mmdeploy - INFO - TorchVision: 0.10.1+cu111
2022-10-21 17:36:32,180 - mmdeploy - INFO - OpenCV: 4.6.0
2022-10-21 17:36:32,180 - mmdeploy - INFO - MMCV: 1.4.0
2022-10-21 17:36:32,180 - mmdeploy - INFO - MMCV Compiler: MSVC 192930146
2022-10-21 17:36:32,180 - mmdeploy - INFO - MMCV CUDA Compiler: 11.1
2022-10-21 17:36:32,180 - mmdeploy - INFO - MMDeploy: 0.8.0+47d4e6f
2022-10-21 17:36:32,180 - mmdeploy - INFO - 

2022-10-21 17:36:32,180 - mmdeploy - INFO - **********Backend information**********
2022-10-21 17:36:33,162 - mmdeploy - INFO - onnxruntime: 1.8.1	ops_is_avaliable : True
2022-10-21 17:36:33,217 - mmdeploy - INFO - tensorrt: 8.2.3.0	ops_is_avaliable : True
2022-10-21 17:36:33,290 - mmdeploy - INFO - ncnn: None	ops_is_avaliable : False
2022-10-21 17:36:33,294 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-10-21 17:36:33,298 - mmdeploy - INFO - openvino_is_avaliable: False
2022-10-21 17:36:33,361 - mmdeploy - INFO - snpe_is_available: False
2022-10-21 17:36:33,361 - mmdeploy - INFO - 

2022-10-21 17:36:33,361 - mmdeploy - INFO - **********Codebase information**********
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmdet:	2.25.1
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmseg:	0.27.0
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmcls:	None
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmocr:	None
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmedit:	None
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmdet3d:	None
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmpose:	None
2022-10-21 17:36:33,367 - mmdeploy - INFO - mmrotate:	None

Error traceback

C:\Users\DELL\.conda\envs\mmdeploy\python.exe F:/workspace/mmdeploy/tools/deploy.py F:\workspace\mmdeploy\configs\mmseg\segmentation_tensorrt_static-512x512.py F:\workspace\mmsegmentation\configs\segmenter\segmenter_vit-s_mask_8x1_512x512_160k_ade20k.py F:\workspace\mmsegmentation\checkpoints\segmenter_vit-b_mask_8x1_512x512_160k_cancerroi\segmenter_vit-s_mask_8x1_512x512_160k_ade20k_20220105_151706-511bb103.pth 512.jpg --work-dir mmdeploy_model/Segmentor_ade20k --device cuda --dump-info 
2022-10-21 17:37:10,577 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
F:\workspace\mmsegmentation\mmseg\models\losses\cross_entropy_loss.py:235: UserWarning: Default ``avg_non_ignore`` is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set ``avg_non_ignore=True``.
  warnings.warn(
load checkpoint from local path: F:\workspace\mmsegmentation\checkpoints\segmenter_vit-b_mask_8x1_512x512_160k_cancerroi\segmenter_vit-s_mask_8x1_512x512_160k_ade20k_20220105_151706-511bb103.pth
2022-10-21 17:37:16,006 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
2022-10-21 17:37:16,006 - mmdeploy - INFO - Export PyTorch model to ONNX: mmdeploy_model/Segmentor_ade20k\end2end.onnx.
2022-10-21 17:37:16,134 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\codebase\mmseg\models\segmentors\base.py:39: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  img_shape = [int(val) for val in img_shape]
F:\workspace\mmsegmentation\mmseg\models\utils\embed.py:62: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  output_h = math.ceil(input_h / stride_h)
F:\workspace\mmsegmentation\mmseg\models\utils\embed.py:63: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  output_w = math.ceil(input_w / stride_w)
F:\workspace\mmsegmentation\mmseg\models\utils\embed.py:64: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  pad_h = max((output_h - 1) * stride_h +
F:\workspace\mmsegmentation\mmseg\models\utils\embed.py:66: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  pad_w = max((output_w - 1) * stride_w +
F:\workspace\mmsegmentation\mmseg\models\utils\embed.py:72: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_h > 0 or pad_w > 0:
F:\workspace\mmsegmentation\mmseg\models\backbones\vit.py:356: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if x_len != pos_len:
C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\pytorch\functions\multi_head_attention_forward.py:23: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  q = q / math.sqrt(E)
Process Process-2:
Traceback (most recent call last):
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\pytorch2onnx.py", line 96, in torch2onnx
    export(
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\onnx\export.py", line 122, in export
    torch.onnx.export(
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\__init__.py", line 275, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\utils.py", line 88, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\utils.py", line 689, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\core\rewriters\rewriter_utils.py", line 379, in wrapper
    return self.func(self, *args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\mmdeploy\apis\onnx\optimizer.py", line 10, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\utils.py", line 463, in _model_to_graph
    graph = _optimize_graph(graph, operator_export_type,
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\utils.py", line 200, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\__init__.py", line 313, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\utils.py", line 994, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\symbolic_helper.py", line 172, in wrapper
    return fn(g, *args, **kwargs)
  File "C:\Users\DELL\.conda\envs\mmdeploy\lib\site-packages\torch\onnx\symbolic_opset9.py", line 546, in transpose
    raise RuntimeError('Unsupported: ONNX export of transpose for tensor '
RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.
2022-10-21 17:37:20,257 - mmdeploy - ERROR - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

Process finished with exit code 1
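
The traceback ends in the exporter's `transpose` symbolic function. An ONNX Transpose node takes a full axis permutation, so swapping two dimensions means enumerating every axis, and that requires the tensor's rank. The following is a simplified illustration of that dependency, not PyTorch's actual code; the function name `swap_to_perm` is hypothetical.

```python
from typing import List, Optional


def swap_to_perm(rank: Optional[int], dim0: int, dim1: int) -> List[int]:
    """Build the full axis permutation that swaps dim0 and dim1."""
    if rank is None:
        # Without a known rank there is no way to enumerate the axes.
        raise RuntimeError("transpose for tensor of unknown rank")
    axes = list(range(rank))
    # Normalise negative dims the way Python indexing does.
    dim0, dim1 = dim0 % rank, dim1 % rank
    axes[dim0], axes[dim1] = axes[dim1], axes[dim0]
    return axes


print(swap_to_perm(3, 1, 2))  # [0, 2, 1]
```

If shape information is lost somewhere earlier in the traced graph, the rank arrives as unknown and the export cannot proceed, which is consistent with the error message shown above.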

AlfaRomeo9527, Oct 21 '22 09:10

@AlfaRomeo9527 Hi, it seems that you've trained on a new dataset.

  1. Could you post here what changes you made to mmseg?
  2. Try converting the segmentor with the config and checkpoint from the mmseg repo and see if it succeeds.
  3. You could also open the ONNX model in Netron and check the name of the failing Transpose node.

RunningLeon, Oct 24 '22 03:10
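
As a scriptable alternative to browsing the graph in Netron, the Transpose nodes can be listed with the `onnx` Python package. This is a minimal sketch under two assumptions: `onnx` is installed, and an `end2end.onnx` file was actually written to the work directory, which may not be the case when the export aborts as in the traceback above. The helper names `find_nodes` and `list_transposes` are hypothetical.

```python
from types import SimpleNamespace
from typing import Iterable, List


def find_nodes(nodes: Iterable, op_type: str = "Transpose") -> List[str]:
    """Return the names of all graph nodes with the given op_type."""
    return [n.name for n in nodes if n.op_type == op_type]


def list_transposes(onnx_path: str) -> List[str]:
    """Load an ONNX file and list the names of its Transpose nodes."""
    import onnx  # imported lazily; requires the `onnx` package

    model = onnx.load(onnx_path)
    return find_nodes(model.graph.node)


# Stand-in nodes so the helper can be tried without an ONNX file.
demo = [
    SimpleNamespace(name="Conv_0", op_type="Conv"),
    SimpleNamespace(name="Transpose_5", op_type="Transpose"),
]
print(find_nodes(demo))  # ['Transpose_5']
```

With a real file, `list_transposes("mmdeploy_model/Segmentor_ade20k/end2end.onnx")` would return the node names to look up in Netron.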

  1. I modified some of the parameters for my new dataset, and the error was reported during conversion.
  2. I then downloaded the model directly from the mmseg repo and converted it without modifying any parameters, but the error was still reported.
  3.

AlfaRomeo9527, Oct 25 '22 03:10

@AlfaRomeo9527 Could you try with pytorch==1.8.0? It works fine on my side.

RunningLeon, Oct 26 '22 03:10

I also solved it with pytorch==1.8.0, though I don't know the reason.

wuwulin, Jul 24 '23 09:07
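
Two replies in this thread report that pytorch==1.8.0 avoids the error, while the failing environment used 1.9.1+cu111. A small pre-flight check along these lines could flag the mismatch before running the conversion. It is only a sketch: the helper names are hypothetical, and treating "same major.minor as 1.8.0" as the known-good condition is an assumption drawn from this thread, not a documented compatibility rule.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.9.1+cu111' into (1, 9, 1), ignoring the local build tag."""
    release = version.split("+", 1)[0]
    return tuple(int(p) for p in re.findall(r"\d+", release)[:3])


def matches_known_good(version: str, known_good: str = "1.8.0") -> bool:
    """True when major.minor equals the version reported to work here."""
    return parse_version(version)[:2] == parse_version(known_good)[:2]


print(matches_known_good("1.9.1+cu111"))  # False
print(matches_known_good("1.8.0"))  # True
```

In a real environment the version string would come from `torch.__version__`.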