
Error occurred while converting SATRN model

[Open] Dream-Zhou opened this issue 1 year ago · 3 comments

I tried to convert the official SATRN model to a TensorRT engine, but it failed, and the conversion process was very slow. I have not modified the deploy config file or the model config file. I have also tried the approach described in the documentation (https://github.com/open-mmlab/mmdeploy/blob/master/docs/en/experimental/onnx_optimizer.md), but it did not solve the problem.

Command:

python tools/deploy.py configs/mmocr/text-recognition/text-recognition_tensorrt_static-1x32x32.py /home/dataCenter/train-tool/deploy/mmocr/configs/textrecog/satrn/satrn_academic.py /home/dataCenter/train-tool/mmocr/checkpoints/satrn_academic_20211009-cb8b1580.pth ~/ann_18.bmp --work-dir ~/207.deploy.080102 --device cuda:0 --dump-info

Result:

2022-08-01 18:44:02,417 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
load checkpoint from local path: /home/dataCenter/train-tool/mmocr/checkpoints/satrn_academic_20211009-cb8b1580.pth
2022-08-01 18:44:17,281 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
2022-08-01 18:44:17,281 - mmdeploy - INFO - Export PyTorch model to ONNX: ~/207.deploy.080102/end2end.onnx.
/home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/codebase/mmocr/models/text_recognition/base.py:51: TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  img_shape = [int(val) for val in img_shape]
/home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/codebase/mmocr/models/text_recognition/base.py:51: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  img_shape = [int(val) for val in img_shape]
/home/dataCenter/train-tool/deploy/mmocr/mmocr/models/textrecog/encoders/satrn_encoder.py:75: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  valid_width = min(w, math.ceil(w * valid_ratio))
/home/dataCenter/train-tool/deploy/mmocr/mmocr/models/textrecog/encoders/satrn_encoder.py:75: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  valid_width = min(w, math.ceil(w * valid_ratio))
/home/dataCenter/train-tool/deploy/mmocr/mmocr/models/textrecog/decoders/nrtr_decoder.py:126: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  valid_width = min(T, math.ceil(T * valid_ratio))
/home/dataCenter/train-tool/deploy/mmocr/mmocr/models/textrecog/decoders/nrtr_decoder.py:126: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  valid_width = min(T, math.ceil(T * valid_ratio))
/home/isee/anaconda3/envs/deploy/lib/python3.7/site-packages/torch/onnx/symbolic_helper.py:325: UserWarning: Type cannot be inferred, which might cause exported graph to produce incorrect results.
  warnings.warn("Type cannot be inferred, which might cause exported graph to produce incorrect results.")
2022-08-01 19:00:01,571 - mmdeploy - INFO - Execute onnx optimize passes.
2022-08-01 19:00:01,691 - mmdeploy - WARNING - Can not optimize model, please build torchscipt extension.
More details: https://github.com/open-mmlab/mmdeploy/blob/master/docs/en/experimental/onnx_optimizer.md
2022-08-01 19:00:07,659 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
2022-08-01 19:00:10,790 - mmdeploy - INFO - Start pipeline mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt in subprocess
2022-08-01 19:00:11,209 - mmdeploy - INFO - Successfully loaded tensorrt plugins from /home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +455, GPU +0, now: CPU 533, GPU 416 (MiB)
Process Process-3:
Traceback (most recent call last):
  File "/home/isee/anaconda3/envs/deploy/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/isee/anaconda3/envs/deploy/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/backend/tensorrt/onnx2tensorrt.py", line 88, in onnx2tensorrt
    device_id=device_id)
  File "/home/dataCenter/train-tool/deploy/MMDeploy/mmdeploy/backend/tensorrt/utils.py", line 109, in from_onnx
    if not parser.parse(onnx_model.SerializeToString()):
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2680668146
2022-08-01 19:00:13,629 - mmdeploy - ERROR - mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt with Call id: 1 failed. exit.
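The failure is in the last step: from_onnx calls parser.parse(onnx_model.SerializeToString()), so the whole ModelProto has to fit in a single protobuf message, and at 2,680,668,146 bytes it does not. A minimal diagnostic sketch (editorial, not part of the thread; the path is the --work-dir from the command above, adjust if needed) to measure how much of that is weight data:

```python
import os

import onnx
from onnx import numpy_helper

# Path comes from the --work-dir used in the command above; adjust as needed.
model_path = os.path.expanduser('~/207.deploy.080102/end2end.onnx')

# onnx.load also pulls in external weight files, if the export wrote any.
model = onnx.load(model_path)
total = sum(numpy_helper.to_array(t).nbytes for t in model.graph.initializer)

print(f'initializer bytes: {total} ({total / 2**30:.2f} GiB)')
# parser.parse() serializes the full ModelProto, which protobuf caps at 2 GiB.
print('over the 2 GiB protobuf cap' if total >= 2**31 else 'under the cap')
```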

Environment:

sys.platform: linux
Python: 3.7.13 (default, Mar 29 2022, 02:18:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3070
CUDA_HOME: /usr/local/cuda-11.3
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.10.2
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: NO AVX
  • CUDA Runtime 11.3
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.2
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.11.3
OpenCV: 4.5.4
MMCV: 1.4.8
MMCV Compiler: GCC 9.4
MMCV CUDA Compiler: 11.3
MMDeploy: 0.6.0+394fb55

Backend information:
onnxruntime: None    ops_is_avaliable: False
tensorrt: 8.0.0.3    ops_is_avaliable: True
ncnn: None    ops_is_avaliable: False
pplnn_is_avaliable: False
openvino_is_avaliable: False

Codebase information:
mmdet: 2.20.0
mmseg: 0.23.0
mmcls: 0.22.1
mmocr: 0.4.1
mmedit: None
mmdet3d: None
mmpose: None
mmrotate: None

Dream-Zhou · Aug 01 '22 11:08

SATRN accepts 3-channel inputs. Please use configs/mmocr/text-recognition/text-recognition_tensorrt_static-32x32.py instead.

AllentDan · Aug 02 '22 00:08
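(For reference, a rough sketch of how a static 3-channel TensorRT deploy config is typically laid out in mmdeploy 0.x. The base files and shapes below are approximations rather than a copy of the repository file; the relevant point is that model_inputs declares a 3-channel input, unlike the 1x32x32 config used above.)

```python
# Approximate sketch of text-recognition_tensorrt_static-32x32.py -- check the
# file in your mmdeploy checkout for the exact base configs and shapes.
_base_ = ['./text-recognition_static.py', '../../_base_/backends/tensorrt.py']

onnx_config = dict(input_shape=[32, 32])

backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 32, 32],  # 3 channels for SATRN
                    opt_shape=[1, 3, 32, 32],
                    max_shape=[1, 3, 32, 32])))
    ])
```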

Thanks for your reply. I used the 3-channel config file instead, but got the same error.

Dream-Zhou · Aug 02 '22 02:08

Oh, please use satrn_small instead. 2 GB is the size limit of the ONNX protobuf.

AllentDan · Aug 02 '22 02:08
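(Editorial aside, not verified in this thread: ONNX itself can keep the ModelProto under the protobuf cap by moving weights into an external sidecar file, e.g. with onnx.save_model. The mmdeploy 0.6 path in the traceback still feeds parser.parse the fully serialized bytes, so this alone would not unblock the conversion there; TensorRT's OnnxParser does have a separate parse_from_file API that reads the model, and its external weights, from disk. File names below are assumptions.)

```python
import onnx

# Sketch only: re-save the exported model with its weights moved out of the
# protobuf and into a sidecar file written next to the .onnx.
model = onnx.load('end2end.onnx')  # assumed path
onnx.save_model(
    model,
    'end2end_external.onnx',
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location='end2end_weights.bin',
    size_threshold=1024,  # tensors above 1 KiB are written to the sidecar file
)
```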

> Oh, please use satrn_small instead. 2 GB is the size limit of the ONNX protobuf.

@AllentDan Is there any way we can solve this?

huliang2016 · Sep 29 '22 09:09

So far, it is ONNX itself that does not support models larger than 2 GB.

AllentDan · Sep 29 '22 09:09