yolov7
I have a question about detection speed
I ran detection with the pretrained YOLOv7 model on an A100, but it takes about 0.014 s (14 ms) per image, so I don't get the 161 FPS you mention.
What is the average time in ms per image when running detection with the pretrained yolov7 model?
I can reach 161 FPS with TensorRT, but only when counting inference time alone.
YOLOv7 batch 1 inference time on V100 without TRT: 0.161ms pre-process, 6.163ms inference, 0.727ms NMS. YOLOv7 batch 32 average inference time on V100 without TRT: 0.112ms pre-process, 2.795ms inference, 0.660ms NMS.
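To relate these timings to the 161 FPS figure, it helps to convert milliseconds per image into frames per second. The following is a minimal sketch using only the V100 numbers quoted above; the helper name `fps` is illustrative and not part of the repository.

```python
def fps(*stage_ms: float) -> float:
    """Frames per second for an image whose stages take the given times (ms)."""
    return 1000.0 / sum(stage_ms)


# V100, batch 1, without TensorRT (numbers quoted in the comment above)
pre, inf, nms = 0.161, 6.163, 0.727
inference_only_b1 = fps(inf)
end_to_end_b1 = fps(pre, inf, nms)

# V100, batch 32, average per image, without TensorRT
pre32, inf32, nms32 = 0.112, 2.795, 0.660
inference_only_b32 = fps(inf32)
end_to_end_b32 = fps(pre32, inf32, nms32)

print(f"batch 1:  {inference_only_b1:.0f} FPS inference only, "
      f"{end_to_end_b1:.0f} FPS with pre-process and NMS")
print(f"batch 32: {inference_only_b32:.0f} FPS inference only, "
      f"{end_to_end_b32:.0f} FPS with pre-process and NMS")
```

At batch 1, inference alone works out to about 162 FPS, which is in line with the 161 FPS figure, while adding pre-processing and NMS brings it down to about 142 FPS. A 14 ms per-image time corresponds to roughly 71 FPS.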
YOLOv7 inference time:
- 6ms on GPU V100 (134 TFlops-TC)
- 12ms on GPU T4 (65 TFlops-TC)
It is very strange that you get 14ms on an A100 (310 TFlops-TC) when a T4 (65 TFlops-TC) already reaches 12ms. The T4 is roughly 4x slower than the A100.
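The expectation behind this remark can be made explicit with a back-of-the-envelope estimate. The sketch below assumes that latency scales inversely with peak tensor-core throughput, which real workloads rarely achieve (memory bandwidth, kernel launch overhead and data loading all matter), so the results are rough lower bounds rather than predictions. The helper name `scaled_latency_ms` is hypothetical.

```python
# Peak tensor-core throughput and measured latencies quoted in this thread.
TFLOPS_TC = {"T4": 65, "V100": 134, "A100": 310}
MEASURED_MS = {"T4": 12.0, "V100": 6.0}


def scaled_latency_ms(ref_gpu: str, target_gpu: str) -> float:
    """Estimate target latency, ASSUMING latency is proportional to 1 / peak TFLOPS."""
    return MEASURED_MS[ref_gpu] * TFLOPS_TC[ref_gpu] / TFLOPS_TC[target_gpu]


ratio = TFLOPS_TC["A100"] / TFLOPS_TC["T4"]
print(f"A100 / T4 peak throughput ratio: {ratio:.1f}x")

for ref in MEASURED_MS:
    est = scaled_latency_ms(ref, "A100")
    print(f"A100 estimate scaled from {ref}: {est:.1f} ms")
```

Under that assumption both reference points suggest a few milliseconds per image on an A100, so a measured 14 ms hints at a bottleneck outside the model itself.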
Comparison YOLOv7 vs YOLOv5
When tested in an identical environment on an NVIDIA T4 GPU:
YOLOv7 (51.2% AP, 12.6ms) is 1.5x faster and +6.3% AP more accurate than YOLOv5s6 (44.9% AP, 18.7ms)
https://colab.research.google.com/gist/AlexeyAB/56912451a33981d977ff9ea61025ae40/yolov7trtlinaom.ipynb#scrollTo=-tMYe8f27US9
!python test.py --data data/coco.yaml --img 640 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
...
Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.512
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5s6.pt --name yolov5s6_1280_val
...
Speed: 0.7ms pre-process, 18.7ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.449
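The two scripts print their timing lines in different formats, so comparing them by hand is error-prone. The sketch below parses both formats exactly as they appear in the logs above and computes the inference-only speedup. The regular expressions and the function `parse_speed` are illustrative assumptions written against these two log lines only, not code from either repository.

```python
import re

# YOLOv7 test.py style: "Speed: 12.6/0.9/13.5 ms inference/NMS/total per ..."
V7_SPEED = re.compile(r"Speed:\s*([\d.]+)/([\d.]+)/([\d.]+)\s*ms inference/NMS/total")
# YOLOv5 val.py style: "Speed: 0.7ms pre-process, 18.7ms inference, 1.7ms NMS per image ..."
V5_SPEED = re.compile(
    r"Speed:\s*([\d.]+)ms pre-process,\s*([\d.]+)ms inference,\s*([\d.]+)ms NMS"
)


def parse_speed(line: str) -> dict:
    """Return inference and NMS times in ms from either log format."""
    match = V7_SPEED.search(line)
    if match:
        inference, nms, _total = map(float, match.groups())
        return {"inference": inference, "nms": nms}
    match = V5_SPEED.search(line)
    if match:
        _pre, inference, nms = map(float, match.groups())
        return {"inference": inference, "nms": nms}
    raise ValueError(f"unrecognised speed line: {line!r}")


v7 = parse_speed("Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1")
v5 = parse_speed("Speed: 0.7ms pre-process, 18.7ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)")

speedup = v5["inference"] / v7["inference"]
print(f"YOLOv7 is {speedup:.2f}x faster than YOLOv5s6 (inference only)")
```

Note that the speedup compares inference time only; NMS and pre-processing are reported separately in both logs.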
YOLOv7 (51.2% AP, 12.6ms) has almost the same accuracy but is about 4x faster than YOLOv5m6 (51.3% AP, 49.1ms)
https://colab.research.google.com/gist/AlexeyAB/857c4859a7a27abca8775245884d1ecf/yolov7trtlinaom.ipynb
!python test.py --data data/coco.yaml --img 640 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
...
Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.512
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5m6.pt --name yolov5m6_1280_val
...
Speed: 0.6ms pre-process, 49.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.513
Moreover, YOLOv7-w6 1280x1280 (54.6% AP, 29ms) has comparable accuracy but is 6.6x faster than YOLOv5x6 1280x1280 (55.0% AP, 190ms)
https://colab.research.google.com/gist/AlexeyAB/6f08816fa611def881327de0f5711ae5/yolov7trtlinaom.ipynb
!python test.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7-w6.pt --name yolov7_w6_1280_val
...
Speed: 28.5/1.4/29.9 ms inference/NMS/total per 1280x1280 image at batch-size 1
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.546
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5x6.pt --name yolov5x6_1280_val
...
Speed: 0.6ms pre-process, 190.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.550
YOLOv7-e6 1280x1280 (55.9% AP, 47ms) is +0.9% AP more accurate and 4x faster than YOLOv5x6 1280x1280 (55.0% AP, 188ms)
https://colab.research.google.com/gist/AlexeyAB/4065112d0d1a252eb433fef38c061f66/yolov7trtlinaom.ipynb#scrollTo=JE3o2PqQy2n4
!python test.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7-e6.pt --name yolov7_e6_1280_val
...
Speed: 46.6/1.5/48.1 ms inference/NMS/total per 1280x1280 image at batch-size 1
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.559
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5x6.pt --name yolov5x6_1280_val
...
Speed: 0.7ms pre-process, 188.1ms inference, 1.8ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.550
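The four comparisons above can be recomputed in one place from the logged AP and inference times. This is a minimal sketch: the `Result` dataclass and `compare` function are hypothetical names, and the numbers are copied from the T4 logs in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    model: str
    img: int             # test image size (square)
    ap: float            # COCO AP @[0.50:0.95], in percent
    inference_ms: float  # inference time per image at batch 1


# (YOLOv7 variant, YOLOv5 variant) pairs, values taken from the logs above
PAIRS = [
    (Result("YOLOv7", 640, 51.2, 12.6), Result("YOLOv5s6", 1280, 44.9, 18.7)),
    (Result("YOLOv7", 640, 51.2, 12.6), Result("YOLOv5m6", 1280, 51.3, 49.1)),
    (Result("YOLOv7-w6", 1280, 54.6, 28.5), Result("YOLOv5x6", 1280, 55.0, 190.1)),
    (Result("YOLOv7-e6", 1280, 55.9, 46.6), Result("YOLOv5x6", 1280, 55.0, 188.1)),
]


def compare(a: Result, b: Result) -> tuple:
    """Return (speedup of a over b, AP difference a minus b in points)."""
    return b.inference_ms / a.inference_ms, a.ap - b.ap


rows = [compare(a, b) for a, b in PAIRS]
for (a, b), (speedup, delta_ap) in zip(PAIRS, rows):
    print(f"{a.model}@{a.img} vs {b.model}@{b.img}: "
          f"{speedup:.1f}x faster, {delta_ap:+.1f} AP")
```

From the logged values the speedups come to about 1.5x, 3.9x, 6.7x and 4.0x. The 6.6x quoted in the text follows from the rounded 29 ms and 190 ms figures.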
YOLOv7-tiny inference is slower than YOLOv5n at the same image size (e.g. 640). Are any changes needed when using detect.py?
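One thing worth ruling out before comparing two models this way is measurement noise. The first few GPU calls include one-time initialisation, and CUDA kernels run asynchronously, so wall-clock timing without warmup and synchronisation can mislead. The sketch below is a generic timing harness, not code from either repository; `time_ms` and `fake_inference` are hypothetical names, and the stand-in workload runs on the CPU. With PyTorch on a GPU, one would pass a closure that runs the model as `fn` and `torch.cuda.synchronize` as `sync`.

```python
import time
from statistics import mean
from typing import Callable, Optional


def time_ms(fn: Callable[[], object],
            warmup: int = 10,
            runs: int = 50,
            sync: Optional[Callable[[], object]] = None) -> float:
    """Average wall-clock time of fn() in ms, after untimed warmup calls."""
    for _ in range(warmup):
        fn()  # warmup: results discarded, not timed
    if sync is not None:
        sync()  # make sure warmup work has finished before timing starts
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous work before stopping the clock
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def fake_inference() -> int:
    """Stand-in workload so the example runs without a GPU or a model."""
    return sum(i * i for i in range(10_000))


avg_ms = time_ms(fake_inference, warmup=3, runs=10)
print(f"average: {avg_ms:.3f} ms per call")
```

Timing both models with the same harness, image size and batch size makes it easier to tell a real speed difference from a measurement artefact.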