
SAR model with batching error

Open Phelan164 opened this issue 1 year ago • 1 comments

Thanks for your bug report. We appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

The exported ONNX SAR model fails when it is run on a batch of more than one image.

Reproduction

I exported the SAR model to ONNX with:

python -m tools.deploy configs/mmocr/text-recognition/text-recognition_onnxruntime_dynamic.py /Users/tamnguyen/Desktop/working/src/mmocr/configs/textrecog/sar/sar_r31_sequential_decoder_academic.py ~/Downloads/sar_r31_sequential_decoder_academic-d06c9a8e.pth ~/Desktop/the.png --work-dir models/textrecog/sar/ --dump-info

When I run inference on a batch of 2 images, the following error occurs:

Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running LSTM node. Name:'LSTM_142' Status Message: Input initial_h must have shape {1,2,512}. Actual:{1,1,512}
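The shape mismatch in this message can be read against the ONNX LSTM operator definition, where the optional initial_h input has shape [num_directions, batch_size, hidden_size]. The sketch below is only an illustration in plain Python: the helper names are hypothetical and it does not call onnxruntime. It rebuilds the check for the numbers in the error, a batch of 2 against a hidden state sized for a batch of 1.

```python
from typing import Sequence, Tuple


def expected_initial_h_shape(
    num_directions: int, batch_size: int, hidden_size: int
) -> Tuple[int, int, int]:
    """Shape the ONNX LSTM operator requires for its optional initial_h input."""
    return (num_directions, batch_size, hidden_size)


def format_shape(shape: Sequence[int]) -> str:
    """Render a shape the way the error message does, e.g. {1,2,512}."""
    return "{" + ",".join(str(dim) for dim in shape) + "}"


def check_initial_h(
    actual: Sequence[int], num_directions: int, batch_size: int, hidden_size: int
) -> str:
    """Return "ok" or a message mirroring the one reported in this issue."""
    expected = expected_initial_h_shape(num_directions, batch_size, hidden_size)
    if tuple(actual) == expected:
        return "ok"
    return (
        "Input initial_h must have shape "
        + format_shape(expected)
        + ". Actual:"
        + format_shape(actual)
    )


# Hidden state fixed at batch 1, but the LSTM input now carries a batch of 2.
print(check_initial_h((1, 1, 512), num_directions=1, batch_size=2, hidden_size=512))
# With a single image the shapes agree.
print(check_initial_h((1, 1, 512), num_directions=1, batch_size=1, hidden_size=512))
```

Read this way, the message suggests that the hidden state fed to LSTM_142 kept a batch size of 1 while the image input was resized to 2.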

The model runs correctly with a single image.
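Since single-image inference works, one possible stopgap is to feed the images one at a time and collect the results. The following is a minimal sketch, not a fix for the export itself. The function names and the model path are hypothetical, the input name "input" is taken from the dynamic_axes setting in this report, and each image is assumed to be a float32 NumPy array of shape (C, H, W) that is already preprocessed.

```python
from typing import Any, Callable, Iterable, List


def infer_one_at_a_time(
    run_single: Callable[[Any], Any], images: Iterable[Any]
) -> List[Any]:
    """Call a single-image inference function once per image, keeping order."""
    return [run_single(image) for image in images]


def make_onnxruntime_runner(
    model_path: str, input_name: str = "input"
) -> Callable[[Any], Any]:
    """Build a single-image runner on top of onnxruntime.

    onnxruntime is imported lazily so the helper above stays usable without it.
    """
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    def run_single(image: Any) -> Any:
        # Add a leading batch axis of size 1, the only batch size known to work.
        outputs = session.run(None, {input_name: image[None, ...]})
        # Take the first output and drop the batch axis again.
        return outputs[0][0]

    return run_single


# Usage (hypothetical path; requires onnxruntime and the exported model):
#   run_single = make_onnxruntime_runner("models/textrecog/sar/end2end.onnx")
#   results = infer_one_at_a_time(run_single, [image_a, image_b])
```

This trades throughput for correctness: every image is a separate session call, so it gives none of the speedup that batching would.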

BTW, the model was exported with the following dynamic_axes setting:

dynamic_axes {'input': {0: 'batch', 3: 'width'}, 'output': {0: 'batch', 1: 'seq_len', 2: 'num_classes'}}
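For reference, a dynamic_axes mapping of this form only names which dimensions of the graph's inputs and outputs are symbolic. As far as I understand, it does not by itself make every internal node independent of the batch size used while tracing. The sketch below shows what the mapping declares. The helper name and the traced shape (1, 3, 48, 160) are made up for the example and are not taken from the SAR config.

```python
from typing import Dict, Sequence, Tuple, Union

Dim = Union[int, str]

# The mapping reported in this issue.
dynamic_axes: Dict[str, Dict[int, str]] = {
    "input": {0: "batch", 3: "width"},
    "output": {0: "batch", 1: "seq_len", 2: "num_classes"},
}


def symbolic_shape(traced_shape: Sequence[int], axes: Dict[int, str]) -> Tuple[Dim, ...]:
    """Replace the dimensions listed in `axes` with their symbolic names."""
    return tuple(axes.get(index, dim) for index, dim in enumerate(traced_shape))


# Hypothetical shape of the dummy image used during export.
print(symbolic_shape((1, 3, 48, 160), dynamic_axes["input"]))
# → ('batch', 3, 48, 'width')
```

Only axes 0 and 3 of the input are declared free here. Any tensor created inside the model with a size fixed at trace time, such as an initial LSTM state, is not covered by this declaration.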

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

Environment

  1. Please run python tools/check_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
2022-08-08 10:59:51,830 - mmdeploy - INFO - 

2022-08-08 10:59:51,831 - mmdeploy - INFO - **********Environmental information**********
2022-08-08 10:59:52,085 - mmdeploy - INFO - sys.platform: darwin
2022-08-08 10:59:52,086 - mmdeploy - INFO - Python: 3.8.13 (default, Mar 28 2022, 06:16:26) [Clang 12.0.0 ]
2022-08-08 10:59:52,086 - mmdeploy - INFO - CUDA available: False
2022-08-08 10:59:52,086 - mmdeploy - INFO - GCC: Apple clang version 13.1.6 (clang-1316.0.21.2.5)
2022-08-08 10:59:52,086 - mmdeploy - INFO - PyTorch: 1.10.2
2022-08-08 10:59:52,086 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 4.2
  - C++ Version: 201402
  - clang 12.0.0
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/Applications/Xcode-12.0.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -Wno-deprecated-declarations -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -fcolor-diagnostics -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-unused-private-field -Wno-missing-braces -Wno-c++14-extensions -Wno-constexpr-not-const, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=OFF, 

2022-08-08 10:59:52,086 - mmdeploy - INFO - TorchVision: 0.11.3
2022-08-08 10:59:52,086 - mmdeploy - INFO - OpenCV: 4.6.0
2022-08-08 10:59:52,086 - mmdeploy - INFO - MMCV: 1.6.1
2022-08-08 10:59:52,086 - mmdeploy - INFO - MMCV Compiler: clang 13.1.6
2022-08-08 10:59:52,086 - mmdeploy - INFO - MMCV CUDA Compiler: not available
2022-08-08 10:59:52,086 - mmdeploy - INFO - MMDeploy: 0.7.0+f957284
2022-08-08 10:59:52,086 - mmdeploy - INFO - 

2022-08-08 10:59:52,086 - mmdeploy - INFO - **********Backend information**********
2022-08-08 10:59:52,432 - mmdeploy - INFO - onnxruntime: 1.12.0	ops_is_avaliable : False
2022-08-08 10:59:52,434 - mmdeploy - INFO - tensorrt: None	ops_is_avaliable : False
2022-08-08 10:59:52,447 - mmdeploy - INFO - ncnn: None	ops_is_avaliable : False
2022-08-08 10:59:52,449 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-08-08 10:59:52,450 - mmdeploy - INFO - openvino_is_avaliable: False
2022-08-08 10:59:52,462 - mmdeploy - INFO - snpe_is_available: False
2022-08-08 10:59:52,462 - mmdeploy - INFO - 

2022-08-08 10:59:52,462 - mmdeploy - INFO - **********Codebase information**********
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmdet:	2.25.1
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmseg:	None
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmcls:	None
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmocr:	0.6.0
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmedit:	None
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmdet3d:	None
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmpose:	None
2022-08-08 10:59:52,463 - mmdeploy - INFO - mmrotate:	None

Error traceback

If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix

If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

Phelan164, Aug 08 '22 04:08

Hi, please refer to the closed issue in MMOCR here. MMOCR text recognition models do not fully support dynamic batch inference because of valid_ratios.

AllentDan, Aug 08 '22 07:08

Closing this because there has been no activity for a long time. Feel free to reopen it if it is still an issue for you.

AllentDan, Sep 29 '22 09:09