Unable to use converted YOLOv3 model
Describe the bug
I'm trying to convert a trained model based on YOLOv3 from mmdet in order to use it in NVIDIA Triton Inference Server.
Conversion using mmdet2trt finished successfully, but when I try to use the model via inference_detector, it throws an exception:
WARNING:root:module mmdet.models.dense_heads.TransformerHead not exist.
Use load_from_local loader
[TensorRT] INFO: Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[TensorRT] INFO: Detected 1 inputs and 4 output network tensors.
[TensorRT] ERROR: Parameter check failed at: engine.cpp::setBindingDimensions::1137, condition: profileMinDims.d[i] <= dimensions.d[i]
[TensorRT] ERROR: Parameter check failed at: engine.cpp::resolveSlots::1318, condition: allInputDimensionsSpecified(routine)
Traceback (most recent call last):
File "converter/mmdetection-to-tensorrt/demo/inference.py", line 63, in <module>
main()
File "converter/mmdetection-to-tensorrt/demo/inference.py", line 33, in main
result = inference_detector(trt_model, image_path, cfg_path, args.device)
File "/workspace/converter/mmdetection-to-tensorrt/mmdet2trt/apis/inference.py", line 48, in inference_detector
result = model(tensor)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/workspace/converter/torch2trt_dynamic/torch2trt_dynamic/torch2trt_dynamic.py", line 478, in forward
shape = tuple(self.context.get_binding_shape(idx))
ValueError: __len__() should return >= 0
mmdetection-to-tensorrt/demo/inference.py fails with the same error message.
To Reproduce
- Download model checkpoint and config from here
- Run
python converter/mmdetection-to-tensorrt/demo/inference.py \
test.jpg \
yolo_cropper.py \
yolo_cropper.pth \
yolo_cropper.trt.pth
Environment:
- Host OS: Manjaro Linux
- Dev Container: based on Docker image nvcr.io/nvidia/tensorrt:21.06-py3
- python_version: 3.8.5
- pytorch_version: 1.8.1+cu111
- cuda_version: 11.3.1
- cudnn_version: 8.2.1
- mmdetection_version: 2.12.0
Additional context
You might need to set a different opt_shape_param when you convert your model, since the default config is for two-stage or RetinaNet-like models.
Read this for details.
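For example, here is a minimal sketch of a conversion with an explicit opt_shape_param, following the project's README-style Python API (the shape values are only illustrative; pick a min shape no larger than your real input and a max shape no smaller than it):

from mmdet2trt import mmdet2trt
import torch

# [min shape, optimize shape, max shape] for the image input;
# these values are illustrative, not tuned for this model
opt_shape_param = [
    [
        [1, 3, 320, 320],    # min shape
        [1, 3, 608, 608],    # optimize shape
        [1, 3, 1344, 1344],  # max shape
    ]
]

trt_model = mmdet2trt('yolo_cropper.py',
                      'yolo_cropper.pth',
                      opt_shape_param=opt_shape_param,
                      fp16_mode=False,
                      max_workspace_size=1 << 30)
torch.save(trt_model.state_dict(), 'yolo_cropper.trt.pth')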
Thank you, I think changing min_shape solved that problem. But now I have a problem: when I run the model in Triton, it fails with this message:
I0709 10:05:41.195130 1 plan_backend.cc:2513] Running yolo_cropper_0_gpu0 with 1 requests
I0709 10:05:41.195174 1 plan_backend.cc:3431] Optimization profile default [0] is selected for yolo_cropper_0_gpu0
I0709 10:05:41.195216 1 pinned_memory_manager.cc:161] pinned memory allocation: size 3326976, addr 0x7f378a000090
I0709 10:05:41.195995 1 plan_backend.cc:2936] Context with profile default [0] is being executed for yolo_cropper_0_gpu0
E0709 10:05:41.196090 1 logging.cc:43] (Unnamed Layer* 214) [Concatenation]: dimensions not compatible for concatenation
E0709 10:05:41.196099 1 logging.cc:43] shapeMachine.cpp (276) - Shape Error in operator(): condition '==' violated
E0709 10:05:41.196104 1 logging.cc:43] Instruction: CHECK_EQUAL 30 29
Do you know what concatenation it could be referring to?
The input I pass is shaped (1, 3, 608, 456).
Err, I do not have much experience with Triton. According to the logs, it seems like the input tensor shape is not a multiple of 32 (this limit comes from mmdet). Please check if the preprocessing is the same as in mmdetection.
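For reference, a minimal sketch of that padding step (pad_to_multiple is just an illustrative helper; it mirrors mmdetection's Pad(size_divisor=32) pipeline transform, which is what the 32-multiple limit comes from):

import numpy as np

def pad_to_multiple(img, divisor=32, pad_val=0):
    # Pad on the bottom/right so H and W become multiples of `divisor`.
    h, w = img.shape[:2]
    pad_h = int(np.ceil(h / divisor)) * divisor - h
    pad_w = int(np.ceil(w / divisor)) * divisor - w
    return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)),
                  mode='constant', constant_values=pad_val)

# e.g. a 608x456 HWC image becomes 608x480 after padding
padded = pad_to_multiple(np.zeros((608, 456, 3), dtype=np.uint8))
print(padded.shape)  # (608, 480, 3)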
Thank you for the direction. It seems like it was indeed a padding problem; interestingly, this problem did not occur with the Faster R-CNN model.
However, now I have a problem where Triton just exits with this line printed:
#assertion/workspace/converter/amirstan_plugin/src/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp,132
https://github.com/grimoire/amirstan_plugin/blob/ca8d16fadbf169edcf27541d4044fc2115544998/src/plugin/batchedNMSPlugin/batchedNMSPlugin.cpp#L132
Which I believe is related to amirstan_plugin, the NMS layer in particular. Is there any way of enabling logging for that plugin in order to debug this situation?
I just add a print in the code and build again.