mmdet:3.x error with mmdeploy:dev-1.x

Open shuxp opened this issue 3 years ago • 5 comments

Checklist

  • [X] 1. I have searched related issues but cannot get the expected help.
  • [X] 2. I have read the FAQ documentation but cannot get the expected help.
  • [X] 3. The bug has not been fixed in the latest version.

Describe the bug

With the latest mmdet 3.x and mmdeploy dev-1.x branches, running 'python tools/deploy.py' fails with an error before the ONNX file is generated.

Reproduction

1. Preparation

mmcv

mim install "mmcv>=2.0.0rc1"

mmdet

  • git clone https://github.com/open-mmlab/mmdetection.git -b 3.x
    cd mmdetection
    pip install -v -e .

mmengine

  • git clone https://github.com/open-mmlab/mmengine.git
    cd mmengine
    pip install -v -e .

data (weight file .pth and config file .py)

mim download mmdet --config faster-rcnn_r50_fpn_1x_coco --dest .

2. Before running, I changed some code so that the commands below could run

  • comment out the deepcopy in mmdeploy/codebase/mmdet/models/detectors/two_stage.py, because it failed in my environment
# data_samples = copy.deepcopy(data_samples)
  • change the data type from list to dict (in mmdeploy/codebase/mmdet/deploy/object_detection.py) to fit BaseDataPreprocessor in mmengine
        # data = []
        data = dict(inputs=[], data_samples=[])
        for img in imgs:
            # prepare data
            if isinstance(img, np.ndarray):
                # TODO: remove img_id.
                data_ = dict(img=img, img_id=0)
            else:
                # TODO: remove img_id.
                data_ = dict(img_path=img, img_id=0)
            # build the data pipeline
            data_ = test_pipeline(data_)
            # data.append(data_)
            data['inputs'].append(data_['inputs'])
            data['data_samples'].append(data_['data_samples'])

        # data = data[0]
        if data_preprocessor is not None:
            data = data_preprocessor(data, False)
            return data, data['inputs']
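
The list-to-dict change above can be illustrated in isolation. This is only a minimal sketch, assuming each pipeline result is a dict with `inputs` and `data_samples` keys as in the snippet; the `collate` helper and the placeholder strings are hypothetical stand-ins, not mmdeploy code.

```python
def collate(results):
    """Merge per-image pipeline outputs into one dict of lists."""
    batch = dict(inputs=[], data_samples=[])
    for result in results:
        batch['inputs'].append(result['inputs'])
        batch['data_samples'].append(result['data_samples'])
    return batch


# Placeholder strings stand in for image tensors and data samples.
per_image = [
    dict(inputs='tensor_0', data_samples='sample_0'),
    dict(inputs='tensor_1', data_samples='sample_1'),
]
batch = collate(per_image)
print(batch['inputs'])
print(batch['data_samples'])
```

The result is a single dict with two parallel lists, which is the shape the modified code hands to the data preprocessor.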

3. Running commands

  • apart from the code modified above, nothing was changed

  • python tools\deploy.py configs\mmdet\detection\detection_tensorrt_dynamic-320x320-1344x1344.py ..\faster-rcnn_r50_fpn_1x_coco.py ..\faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth ..\mmdetection\demo\demo.jpg --work-dir work_dir --show --device cuda:0 --dump-info

  • When debugging into the code, the error occurred here (torch/onnx/utils.py#444):

graph_inputs = list(graph.inputs())
  • The calling pipeline:
torch.onnx.export(export.py) 
--> wrapper(rewriter_utils.py) 
--> model_to_graph_custom_optimizer(optimizer.py) 
--> _model_to_graph(torch.onnx.utils) 
--> _create_jit_graph(torch.onnx.utils, line 444)

Environment

(trt8.4) D:\1-code\mmlab\mmdeploy>python tools\check_env.py
09/13 21:29:51 - mmengine - INFO -

09/13 21:29:51 - mmengine - INFO - **********Environmental information**********
09/13 21:29:54 - mmengine - INFO - sys.platform: win32
09/13 21:29:54 - mmengine - INFO - Python: 3.9.12 (main, Apr  4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)]
09/13 21:29:54 - mmengine - INFO - CUDA available: True
09/13 21:29:54 - mmengine - INFO - numpy_random_seed: 2147483648
09/13 21:29:54 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
09/13 21:29:54 - mmengine - INFO - CUDA_HOME: D:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
09/13 21:29:54 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
09/13 21:29:54 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30146 for x64
09/13 21:29:54 - mmengine - INFO - GCC: n/a
09/13 21:29:54 - mmengine - INFO - PyTorch: 1.11.0+cu113
09/13 21:29:54 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

09/13 21:29:54 - mmengine - INFO - TorchVision: 0.12.0+cu113
09/13 21:29:54 - mmengine - INFO - OpenCV: 4.6.0
09/13 21:29:54 - mmengine - INFO - MMEngine: 0.1.0
09/13 21:29:54 - mmengine - INFO - MMCV: 2.0.0rc1
09/13 21:29:54 - mmengine - INFO - MMCV Compiler: MSVC 192829924
09/13 21:29:54 - mmengine - INFO - MMCV CUDA Compiler: 11.3
09/13 21:29:54 - mmengine - INFO - MMDeploy: 0.7.0+0aad635
09/13 21:29:54 - mmengine - INFO -

09/13 21:29:54 - mmengine - INFO - **********Backend information**********
09/13 21:29:54 - mmengine - INFO - onnxruntime: None    ops_is_avaliable : False
09/13 21:29:54 - mmengine - INFO - tensorrt: 8.4.3.1    ops_is_avaliable : False
09/13 21:29:55 - mmengine - INFO - ncnn: None   ops_is_avaliable : False
09/13 21:29:55 - mmengine - INFO - pplnn_is_avaliable: False
09/13 21:29:55 - mmengine - INFO - openvino_is_avaliable: False
09/13 21:29:55 - mmengine - INFO - snpe_is_available: False
09/13 21:29:55 - mmengine - INFO -

09/13 21:29:55 - mmengine - INFO - **********Codebase information**********
09/13 21:29:55 - mmengine - INFO - mmdet:       3.0.0rc0
09/13 21:29:55 - mmengine - INFO - mmseg:       None
09/13 21:29:55 - mmengine - INFO - mmcls:       None
09/13 21:29:55 - mmengine - INFO - mmocr:       None
09/13 21:29:55 - mmengine - INFO - mmedit:      None
09/13 21:29:55 - mmengine - INFO - mmdet3d:     None
09/13 21:29:55 - mmengine - INFO - mmpose:      None
09/13 21:29:55 - mmengine - INFO - mmrotate:    None

Error traceback

The same as https://github.com/open-mmlab/mmdeploy/issues/803#issuecomment-1200153598.

shuxp avatar Sep 13 '22 14:09 shuxp

@lvhan028

tpoisonooo avatar Sep 13 '22 14:09 tpoisonooo

#979 is under review. You can try the #979 branch.

hanrui1sensetime avatar Sep 14 '22 03:09 hanrui1sensetime

This issue occurred on the 3060 laptop. On the 2080 PC, I can convert the model successfully.

However, when running inference with the SDK on the 2080 machine, an error occurred.

Some modifications to CMakeLists.txt:

set(OpenCV_DIR "F:\\3-library\\opencv\\build")
set(TENSORRT_DIR "F:\\3-library\\TensorRT-8.4.3.1")
set(CUDNN_DIR "F:\\3-library\\cudnn-windows-x86_64-8.4.1.50_cuda11.6-archive")
set(pplcv_DIR "F:\\1-code\\cpp\\ppl.cv\\pplcv-build\\install\\lib\\cmake\\ppl")

# options
option(MMDEPLOY_SHARED_LIBS "build shared libs" ON)
option(MMDEPLOY_BUILD_SDK "build MMDeploy SDK" ON)
option(MMDEPLOY_BUILD_SDK_MONOLITHIC "build single lib for SDK API" OFF)
option(MMDEPLOY_BUILD_TEST "build unittests" OFF)
option(MMDEPLOY_BUILD_SDK_PYTHON_API "build SDK Python API" ON)
option(MMDEPLOY_BUILD_SDK_CXX_API "build SDK C++ API" ON)
option(MMDEPLOY_BUILD_SDK_CSHARP_API "build SDK C# API support" ON)
option(MMDEPLOY_BUILD_SDK_JAVA_API "build SDK JAVA API" OFF)
option(MMDEPLOY_BUILD_EXAMPLES "build examples" ON)
option(MMDEPLOY_SPDLOG_EXTERNAL "use external spdlog" OFF)
option(MMDEPLOY_ZIP_MODEL "support SDK model in zip format" OFF)
option(MMDEPLOY_COVERAGE "build SDK for coverage" OFF)

set(MMDEPLOY_TARGET_DEVICES "cuda" CACHE STRING "target devices to support")
set(MMDEPLOY_TARGET_BACKENDS "trt" CACHE STRING "target inference engines to support")
set(MMDEPLOY_CODEBASES "mmdet" CACHE STRING "select OpenMMLab codebases")

Running commands

  • mim download mmdet --config faster-rcnn_r50_fpn_1x_coco --dest .
  • python tools\deploy.py configs\mmdet\detection\detection_tensorrt_dynamic-320x320-1344x1344.py ..\faster-rcnn_r50_fpn_1x_coco.py ..\faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth ..\mmdetection\demo\demo.jpg --work-dir work_dir --show --device cuda:0 --dump-info
  • cd mmdeploy/build/bin/Release
  • object_detection.exe cuda mmdeploy/work_dir mmdetection/demo/demo.jpg (failed)
F:\1-code\mmlab\mmdeploy\demo\csrc-run\build\Debug>object_detection.exe 
loading mmdeploy_execution ...
loading mmdeploy_cpu_device ...
loading mmdeploy_cuda_device ...
loading mmdeploy_graph ...
loading mmdeploy_directory_model ...
[2022-09-14 11:24:43.179] [mmdeploy] [info] [model.cpp:95] Register 'DirectoryModel'
loading mmdeploy_transform ...
loading mmdeploy_cpu_transform_impl ...
loading mmdeploy_cuda_transform_impl ...
loading mmdeploy_transform_module ...
loading mmdeploy_trt_net ...
loading mmdeploy_net_module ...
loading mmdeploy_mmdet ...
[2022-09-14 11:24:43.243] [mmdeploy] [info] [model.cpp:38] DirectoryModel successfully load model F:\1-code\mmlab\mmdeploy\work_dir
[2022-09-14 11:24:43.379] [mmdeploy] [error] [compose.cpp:24] Unable to find Transform creator: LoadAnnotations. Available transforms: ["CenterCrop", "Collect", "Compose", "DefaultFormatBundle", "ImageToTensor", "LoadImageFromFile", "Normalize", "Pad", "Resize"]
[2022-09-14 11:24:43.380] [mmdeploy] [error] [task.cpp:67] error parsing config: {
  "context": {
    "device": "<any>",
    "model": "<any>",
    "stream": "<any>"
  },
  "input": [
    "img"
  ],
  "module": "Transform",
  "name": "Preprocess",
  "output": [
    "prep_output"
  ],
  "transforms": [
    {
      "file_client_args": {
        "backend": "disk"
      },
      "type": "LoadImageFromFile"
    },
    {
      "keep_ratio": true,
      "scale": [
        1333,
        800
      ],
      "type": "Resize"
    },
    {
      "type": "LoadAnnotations",
      "with_bbox": true
    },
    {
      "meta_keys": [
        "img_id",
        "img_path",
        "ori_shape",
        "img_shape",
        "scale_factor"
      ],
      "type": "PackDetInputs"
    }
  ],
  "type": "Task"
}
[2022-09-14 11:24:43.385] [mmdeploy] [error] [pipeline.cpp:145] Could not create Task: Preprocess
[2022-09-14 11:24:43.385] [mmdeploy] [error] [pipeline.cpp:159] error parsing config: unknown (6)
[2022-09-14 11:24:43.385] [mmdeploy] [error] [handle.h:31] Failed to create pipeline, config: {
  "context": {
    "device": "<any>",
    "stream": "<any>"
  },
  "pipeline": {
    "input": [
      "image"
    ],
    "output": [
      "det"
    ],
    "tasks": [
      {
        "input": [
          "image"
        ],
        "name": "mmdetection",
        "output": [
          "det"
        ],
        "params": {
          "model": "<any>"
        },
        "type": "Inference"
      }
    ]
  }
}
[2022-09-14 11:24:43.389] [mmdeploy] [error] [pipeline.cpp:27] exception caught: unknown (6)
failed to create detector, code: 6
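
The log above names `LoadAnnotations` as the first transform without a creator. As a rough sketch (the `unsupported` helper is hypothetical, and both the list of available transforms and the pipeline are copied by hand from the log rather than queried from the SDK), the dumped pipeline can be checked against that list to see every transform that would presumably fail:

```python
# Transform names the SDK reported as available (copied from the log above).
AVAILABLE = {
    "CenterCrop", "Collect", "Compose", "DefaultFormatBundle",
    "ImageToTensor", "LoadImageFromFile", "Normalize", "Pad", "Resize",
}

# The "transforms" list of the failing Preprocess task (copied from the log).
transforms = [
    {"type": "LoadImageFromFile", "file_client_args": {"backend": "disk"}},
    {"type": "Resize", "keep_ratio": True, "scale": [1333, 800]},
    {"type": "LoadAnnotations", "with_bbox": True},
    {"type": "PackDetInputs",
     "meta_keys": ["img_id", "img_path", "ori_shape", "img_shape",
                   "scale_factor"]},
]


def unsupported(transforms, available):
    """Return transform types missing from `available`, in pipeline order."""
    return [t["type"] for t in transforms if t["type"] not in available]


missing = unsupported(transforms, AVAILABLE)
print(missing)
```

Under these assumptions the check flags `PackDetInputs` as well as `LoadAnnotations`, so removing only the transform named in the error message would likely surface a second error of the same kind.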

shuxp avatar Sep 14 '22 04:09 shuxp

My environment on the 2080 machine:

(mmdet3) F:\1-code\mmlab\mmdeploy>python tools\check_env.py
09/14 12:47:03 - mmengine - INFO -                                              
                                                                                
09/14 12:47:03 - mmengine - INFO - **********Environmental information**********
09/14 12:47:14 - mmengine - INFO - sys.platform: win32                                                                     
09/14 12:47:14 - mmengine - INFO - Python: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
09/14 12:47:14 - mmengine - INFO - CUDA available: True                                                                    
09/14 12:47:14 - mmengine - INFO - numpy_random_seed: 2147483648                                                           
09/14 12:47:14 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 2080                                                          
09/14 12:47:14 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3                     
09/14 12:47:14 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58                                    
09/14 12:47:14 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.33.31629 for x64
09/14 12:47:14 - mmengine - INFO - GCC: n/a
09/14 12:47:14 - mmengine - INFO - PyTorch: 1.11.0+cu113
09/14 12:47:14 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

09/14 12:47:14 - mmengine - INFO - TorchVision: 0.12.0+cu113
09/14 12:47:14 - mmengine - INFO - OpenCV: 4.6.0
09/14 12:47:14 - mmengine - INFO - MMEngine: 0.1.0
09/14 12:47:14 - mmengine - INFO - MMCV: 2.0.0rc1
09/14 12:47:14 - mmengine - INFO - MMCV Compiler: MSVC 192930146
09/14 12:47:14 - mmengine - INFO - MMCV CUDA Compiler: 11.3
09/14 12:47:14 - mmengine - INFO - MMDeploy: 0.7.0+16e9e04
09/14 12:47:14 - mmengine - INFO -

09/14 12:47:14 - mmengine - INFO - **********Backend information**********
09/14 12:47:15 - mmengine - INFO - onnxruntime: 1.12.1  ops_is_avaliable : False
09/14 12:47:15 - mmengine - INFO - tensorrt: 8.4.3.1    ops_is_avaliable : True
09/14 12:47:15 - mmengine - INFO - ncnn: None   ops_is_avaliable : False
09/14 12:47:15 - mmengine - INFO - pplnn_is_avaliable: False
09/14 12:47:15 - mmengine - INFO - openvino_is_avaliable: False
09/14 12:47:15 - mmengine - INFO - snpe_is_available: False
09/14 12:47:15 - mmengine - INFO -

09/14 12:47:15 - mmengine - INFO - **********Codebase information**********
09/14 12:47:15 - mmengine - INFO - mmdet:       3.0.0rc0
09/14 12:47:15 - mmengine - INFO - mmseg:       None
09/14 12:47:15 - mmengine - INFO - mmcls:       None
09/14 12:47:15 - mmengine - INFO - mmocr:       None
09/14 12:47:15 - mmengine - INFO - mmedit:      None
09/14 12:47:15 - mmengine - INFO - mmdet3d:     None
09/14 12:47:15 - mmengine - INFO - mmpose:      None
09/14 12:47:15 - mmengine - INFO - mmrotate:    None
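
Since two check_env.py dumps are now available, one way to spot what differs between the machines is to parse both and compare key by key. The following is only a sketch under stated assumptions: `parse_env` and `diff_env` are hypothetical helpers, the prefix pattern is inferred from the log lines shown here, and the two inputs are short excerpts of the dumps rather than the full output.

```python
import re

# Timestamp and logger prefix, e.g. "09/13 21:29:54 - mmengine - INFO - ".
PREFIX = re.compile(r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2} - mmengine - INFO - ?")


def parse_env(text):
    """Map each prefixed 'key: value' log line to a dict entry."""
    env = {}
    for line in text.splitlines():
        if not PREFIX.match(line):
            continue  # skip continuation lines such as PyTorch build details
        body = PREFIX.sub("", line).strip()
        key, sep, value = body.partition(":")
        if sep and value.strip():
            env[key.strip()] = " ".join(value.split())  # collapse whitespace
    return env


def diff_env(a, b):
    """Return {key: (value_a, value_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


laptop_3060 = """\
09/13 21:29:54 - mmengine - INFO - PyTorch: 1.11.0+cu113
09/13 21:29:54 - mmengine - INFO - MMDeploy: 0.7.0+0aad635
09/13 21:29:54 - mmengine - INFO - tensorrt: 8.4.3.1    ops_is_avaliable : False
"""
desktop_2080 = """\
09/14 12:47:14 - mmengine - INFO - PyTorch: 1.11.0+cu113
09/14 12:47:14 - mmengine - INFO - MMDeploy: 0.7.0+16e9e04
09/14 12:47:15 - mmengine - INFO - tensorrt: 8.4.3.1    ops_is_avaliable : True
"""

diff = diff_env(parse_env(laptop_3060), parse_env(desktop_2080))
for key, (a, b) in diff.items():
    print(f"{key}: {a!r} vs {b!r}")
```

On these excerpts the differences are the MMDeploy commit and the TensorRT `ops_is_avaliable` flag; running the same comparison on the full dumps would also show the Python and MSVC versions.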

shuxp avatar Sep 14 '22 04:09 shuxp

> #979 is under review. You can try the #979 branch.

This branch only fixes some code errors, like the ones I patched above. It does not solve the problem at its source...

shuxp avatar Sep 14 '22 05:09 shuxp

> > #979 is under review. You can try the #979 branch.
>
> This branch only fixes some code errors, like the ones I patched above. It does not solve the problem at its source...

mmdeploy:dev-1.x has been updated a lot since then. Please upgrade mmdet, mmengine, mmcv and mmdeploy and try again.

hanrui1sensetime avatar Nov 21 '22 08:11 hanrui1sensetime

> > > #979 is under review. You can try the #979 branch.
> >
> > This branch only fixes some code errors, like the ones I patched above. It does not solve the problem at its source...
>
> mmdeploy:dev-1.x has been updated a lot since then. Please upgrade mmdet, mmengine, mmcv and mmdeploy and try again.

The same error occurs on both the 1.0.0rc0 branch and the dev-1.x branch.

For OpenMMLab 2.0 projects, with a 3070 on Windows, conversion has never succeeded, from September to November, on either the dev-1.x or the 1.0.0rc0 branch.

Debugging into torch/onnx/utils.py, in the function "_create_jit_graph", graph.inputs() can be iterated but cannot be converted to a list (the conversion hangs, then the process exits).

Maybe this only happens on Windows or on the 3070.
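
For a call that hangs and then exits without a Python exception, one generic way to capture where it stalled is the standard-library `faulthandler` module. The sketch below is not specific to mmdeploy or PyTorch: `run_with_watchdog` is a hypothetical helper, and `slow_step` is a stand-in for whichever call stalls (for example, the export step).

```python
import faulthandler
import os
import tempfile
import time


def run_with_watchdog(func, log_path, timeout=60.0):
    """Call func(); if it is still running after `timeout` seconds, write
    the Python stack of every thread to `log_path`. The call itself is
    not interrupted."""
    with open(log_path, "w") as log:
        faulthandler.dump_traceback_later(timeout, exit=False, file=log)
        try:
            return func()
        finally:
            faulthandler.cancel_dump_traceback_later()


def slow_step():
    # Hypothetical stand-in for the stalling call.
    time.sleep(0.3)
    return "done"


log_path = os.path.join(tempfile.mkdtemp(), "hang_trace.log")
result = run_with_watchdog(slow_step, log_path, timeout=0.05)
with open(log_path) as f:
    trace = f.read()
print(result)
print(trace)
```

The dump only shows Python frames, so if the stall is inside native code it would at most confirm which Python line made the call; it may still help when attaching such a trace to a report.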

shuxp avatar Dec 07 '22 02:12 shuxp

Many users deploy on Windows in production. I hope someone can help me with this. Thanks.

shuxp avatar Dec 07 '22 02:12 shuxp

@irexyc could you please help check this issue on the Windows platform?

lvhan028 avatar Dec 07 '22 14:12 lvhan028