MMDet TensorRT support
- Added a TensorRT model wrapper, covering both instance segmentation and object detection models, to mmdet.py (see the sketch below).
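For readers curious what such a wrapper does under the hood, here is a minimal sketch of loading and calling a TensorRT engine through MMDeploy's Python API. This illustrates the general approach rather than the exact sahi implementation, and all file paths are placeholders:

```python
import torch
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config

# Read the deployment config and the mmdet model config.
deploy_cfg, model_cfg = load_config(
    "detection_tensorrt-fp16_static-640x640.py",
    "rtmdet_l_8xb32-300e_coco.py",
)
task_processor = build_task_processor(model_cfg, deploy_cfg, device="cuda:0")

# Wrap the serialized TensorRT engine as a regular mmdet-style model.
trt_model = task_processor.build_backend_model(["end2end.engine"])

# Preprocess an image and run inference, roughly what a wrapper's
# __call__ method has to do.
model_inputs, _ = task_processor.create_input("demo.jpg", get_input_shape(deploy_cfg))
with torch.no_grad():
    results = trt_model.test_step(model_inputs)
```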
GPU: NVIDIA GeForce RTX 3090
Model config used: rtmdet_l_8xb32-300e_coco.py
Deployment config used: detection_tensorrt-fp16_static-640x640.py (example deployment configs ship with the MMDeploy repository)
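For reference, an end2end.engine file like the one used below is normally produced from an mmdet checkpoint with MMDeploy's tools/deploy.py. A rough sketch, with placeholder checkpoint and image paths:

```bash
python tools/deploy.py \
    detection_tensorrt-fp16_static-640x640.py \
    rtmdet_l_8xb32-300e_coco.py \
    rtmdet_l_checkpoint.pth \
    demo.jpg \
    --work-dir work_dir \
    --device cuda:0
```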
Average inference times (measuring only the `__call__` method of the wrappers):
- Vanilla model: ~28 ms
- TensorRT model: ~14 ms
Average inference times (full get_sliced_prediction call; a timing sketch follows this list):
- input image of size (height=1432, width=4089, channels=3), sliced into (height=640, width=640) patches
- Vanilla model: ~1.4 s
- TensorRT model: ~1 s
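Numbers like these can be reproduced with a simple harness around sahi's get_sliced_prediction; a sketch, with arbitrary warm-up and run counts:

```python
import time

from sahi.predict import get_sliced_prediction


def average_latency(detection_model, image_path, runs=20, warmup=3):
    # Warm-up runs matter especially for TensorRT, whose first calls are slower.
    for _ in range(warmup):
        get_sliced_prediction(image_path, detection_model, slice_height=640, slice_width=640)
    start = time.perf_counter()
    for _ in range(runs):
        get_sliced_prediction(image_path, detection_model, slice_height=640, slice_width=640)
    return (time.perf_counter() - start) / runs
```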
The end-to-end difference here is only around 400 ms, but the relative speedup might be far more significant on edge devices such as the NVIDIA Jetson.
Disclaimer: this implementation is not perfect, and it could be generalized to other frameworks (YOLO, Detectron2, etc.). I needed it for my current project and wanted to share it as a base for anyone interested.
Example usage:
```python
from sahi import AutoDetectionModel

trt = True  # toggle between the vanilla and TensorRT backends
device = "cuda:0"

deploy_config_path = None
category_mapping = None
path_detector = "detection.pth"
if trt:
    path_detector = "end2end.engine"
    deploy_config_path = "detection_tensorrt-fp16_static-640x640.py"
    # Class mapping is needed for the TensorRT backend,
    # e.g. {"0": "person", "1": "bicycle", ...}
    category_mapping = {"0": "person"}

model = AutoDetectionModel.from_pretrained(
    model_type="mmdet",
    model_path=path_detector,
    deploy_config_path=deploy_config_path,
    config_path="rtmdet_l_8xb32-300e_coco.py",
    category_mapping=category_mapping,
    device=device,
)
```
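Once constructed, the model drops into the usual sahi sliced-inference flow; for example (image path and export directory are placeholders):

```python
from sahi.predict import get_sliced_prediction

result = get_sliced_prediction(
    "demo.jpg",
    model,
    slice_height=640,
    slice_width=640,
)
result.export_visuals(export_dir="output/")
```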