mmdeploy
[Bug] KNet conversion to ONNX failure
Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. I have read the FAQ documentation but cannot get the expected help.
- [X] 3. The bug has not been fixed in the latest version.
Describe the bug
I tried to convert my trained KNet model to ONNX with the command given under Reproduction.
With the default ONNX opset (11) I get this error: "torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::einsum' to ONNX opset version 11 is not supported. Support for this operator was added in version 12, try exporting with this version."
When I changed the ONNX opset to 12, I got a different error: "torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::unflatten' to ONNX opset version 12 is not supported." I get this same 'aten::unflatten' error for every ONNX opset from 12 to 18.
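For context, `unflatten` only splits one dimension into several, so it can in principle be expressed as a plain `reshape`, which the ONNX exporter does support. The sketch below computes the target shape in pure Python. The helper name `unflatten_shape` is hypothetical (it is not part of PyTorch, MMDeploy or MMSegmentation), and whether swapping the call is enough to export KNet has not been verified here.

```python
import math
from typing import Sequence, Tuple


def unflatten_shape(shape: Sequence[int], dim: int,
                    sizes: Sequence[int]) -> Tuple[int, ...]:
    """Shape to pass to reshape() so that it mimics unflatten(dim, sizes).

    Hypothetical helper for illustration only.
    """
    ndim = len(shape)
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} is out of range for {ndim} dimensions")
    dim %= ndim  # normalise negative dims

    sizes = list(sizes)
    unknown = [i for i, s in enumerate(sizes) if s == -1]
    if len(unknown) > 1:
        raise ValueError("only one entry of sizes may be -1")
    if unknown:
        # Infer the single -1 entry from the remaining sizes.
        known = math.prod(s for s in sizes if s != -1)
        if known == 0 or shape[dim] % known != 0:
            raise ValueError("sizes do not divide the dimension evenly")
        sizes[unknown[0]] = shape[dim] // known

    if math.prod(sizes) != shape[dim]:
        raise ValueError("product of sizes must equal shape[dim]")
    return tuple(shape[:dim]) + tuple(sizes) + tuple(shape[dim + 1:])


print(unflatten_shape((2, 12, 5), 1, (3, 4)))   # (2, 3, 4, 5)
print(unflatten_shape((4, 9), -1, (3, -1)))     # (4, 3, 3)
```

With a tensor `x`, `x.reshape(unflatten_shape(x.shape, dim, sizes))` should produce the same result as `x.unflatten(dim, sizes)`. Note that under tracing the shape values computed this way are baked in as constants, so the exported graph would be tied to the traced input size.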
I also keep getting these warnings before the error occurs:
/home/ossome/rstream/mmdeploy/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
ys_shape = tuple(int(s) for s in ys.shape)
/home/ossome/rstream/mmsegmentation/mmseg/models/utils/embed.py:62: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
output_h = math.ceil(input_h / stride_h)
/home/ossome/rstream/mmsegmentation/mmseg/models/utils/embed.py:63: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
output_w = math.ceil(input_w / stride_w)
/home/ossome/rstream/mmsegmentation/mmseg/models/utils/embed.py:64: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
pad_h = max((output_h - 1) * stride_h +
/home/ossome/rstream/mmsegmentation/mmseg/models/utils/embed.py:66: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
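To illustrate what these warnings imply, here is a small pure-Python sketch of the kind of padding arithmetic the `embed.py` lines perform. The part of the formula after `stride_h +` is truncated in the log, so the rest is an assumption based on the usual "same" padding formula. The function name `same_padding` and the kernel size and stride of 4 are illustrative values, not taken from my config.

```python
import math


def same_padding(input_size: int, kernel_size: int, stride: int,
                 dilation: int = 1):
    """Output size and total padding for 'same'-style padding (assumed formula)."""
    output_size = math.ceil(input_size / stride)
    pad = max((output_size - 1) * stride
              + (kernel_size - 1) * dilation + 1 - input_size, 0)
    return output_size, pad


# During tracing these are plain Python numbers, so whatever the traced
# input yields is written into the graph as a constant.
print(same_padding(256, kernel_size=4, stride=4))  # (64, 0)
print(same_padding(250, kernel_size=4, stride=4))  # (63, 2)
```

A 256-pixel side needs no padding while a 250-pixel side needs 2, so a graph traced at one size would carry the wrong constant for the other. That is the sense in which the trace "might not generalize to other inputs", and it may matter because the deploy config used here is a dynamic-shape one.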
Reproduction
This is the command I ran:
py3 tools/deploy.py configs/mmseg/segmentation_onnxruntime_dynamic.py /home/ossome/rstream/mmsegmentation/configs/knet/knet-s3_swin-l_upernet_8xb2-adamw-80k_dataset_256x256.py /home/ossome/rstream/mmsegmentation/work_dirs/dataset/best_mIoU_iter_37500.pth demo/resources/2.jpg --work-dir mmdeploy_models/mmseg/ort/ --show --device cuda:0
I made some changes to the KNet config file, but they only concern the dataset being used and the data preprocessing in the training pipeline, so I assume they should not affect the conversion of the model to ONNX.
Environment
12/07 14:33:30 - mmengine - INFO - **********Environmental information**********
/bin/sh: 1: /home/ossome/anaconda3/envs/mmsegmentation/bin/nvcc: not found
/bin/sh: 1: /home/ossome/anaconda3/envs/mmsegmentation/bin/nvcc: not found
12/07 14:33:33 - mmengine - INFO - sys.platform: linux
12/07 14:33:33 - mmengine - INFO - Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
12/07 14:33:33 - mmengine - INFO - CUDA available: True
12/07 14:33:33 - mmengine - INFO - numpy_random_seed: 2147483648
12/07 14:33:33 - mmengine - INFO - GPU 0: NVIDIA GeForce GTX 1050 Ti
12/07 14:33:33 - mmengine - INFO - CUDA_HOME: /home/ossome/anaconda3/envs/mmsegmentation
12/07 14:33:33 - mmengine - INFO - NVCC: Not Available
12/07 14:33:33 - mmengine - INFO - GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
12/07 14:33:33 - mmengine - INFO - PyTorch: 2.0.1
12/07 14:33:33 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
- GCC 9.3
- C++ Version: 201703
- Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.7
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.5
- Magma 2.6.1
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
12/07 14:33:33 - mmengine - INFO - TorchVision: 0.15.2
12/07 14:33:33 - mmengine - INFO - OpenCV: 4.8.0
12/07 14:33:33 - mmengine - INFO - MMEngine: 0.8.3
12/07 14:33:33 - mmengine - INFO - MMCV: 2.0.1
12/07 14:33:33 - mmengine - INFO - MMCV Compiler: GCC 9.3
12/07 14:33:33 - mmengine - INFO - MMCV CUDA Compiler: 11.7
12/07 14:33:33 - mmengine - INFO - MMDeploy: 1.3.0+660af62
12/07 14:33:33 - mmengine - INFO -
12/07 14:33:33 - mmengine - INFO - **********Backend information**********
12/07 14:33:33 - mmengine - INFO - tensorrt: None
12/07 14:33:33 - mmengine - INFO - ONNXRuntime: 1.8.1
12/07 14:33:33 - mmengine - INFO - ONNXRuntime-gpu: 1.8.1
12/07 14:33:33 - mmengine - INFO - ONNXRuntime custom ops: Available
12/07 14:33:33 - mmengine - INFO - pplnn: None
12/07 14:33:33 - mmengine - INFO - ncnn: None
12/07 14:33:33 - mmengine - INFO - snpe: None
12/07 14:33:33 - mmengine - INFO - openvino: None
12/07 14:33:33 - mmengine - INFO - torchscript: 2.0.1
12/07 14:33:33 - mmengine - INFO - torchscript custom ops: NotAvailable
12/07 14:33:33 - mmengine - INFO - rknn-toolkit: None
12/07 14:33:33 - mmengine - INFO - rknn-toolkit2: None
12/07 14:33:33 - mmengine - INFO - ascend: None
12/07 14:33:33 - mmengine - INFO - coreml: None
12/07 14:33:33 - mmengine - INFO - tvm: None
12/07 14:33:33 - mmengine - INFO - vacc: None
12/07 14:33:33 - mmengine - INFO -
12/07 14:33:33 - mmengine - INFO - **********Codebase information**********
12/07 14:33:33 - mmengine - INFO - mmdet: None
12/07 14:33:33 - mmengine - INFO - mmseg: 1.1.0
12/07 14:33:33 - mmengine - INFO - mmpretrain: 1.1.0
12/07 14:33:33 - mmengine - INFO - mmocr: None
12/07 14:33:33 - mmengine - INFO - mmagic: None
12/07 14:33:33 - mmengine - INFO - mmdet3d: None
12/07 14:33:33 - mmengine - INFO - mmpose: None
12/07 14:33:33 - mmengine - INFO - mmrotate: None
12/07 14:33:33 - mmengine - INFO - mmaction: None
12/07 14:33:33 - mmengine - INFO - mmrazor: None
12/07 14:33:33 - mmengine - INFO - mmyolo: None
I am working in a conda environment and installed PyTorch using conda.
Error traceback
No response