I have a question about detection speed

Open StringMin opened this issue 1 year ago • 4 comments

[Screenshot from 2022-07-28 17-31-08]

I trained from the pretrained model on an A100 and tried detection. The speed is about 0.014 s (14 ms) per image, so I cannot reach the 161 fps you mentioned.

What is the average time in ms per image when running detection with the pretrained yolov7 model?

StringMin, Jul 28 '22 08:07

I can reach 161 fps after using TensorRT, but that figure counts inference time only.

deepthman, Jul 28 '22 08:07

YOLOv7 batch 1 inference time on V100 without TRT: 0.161ms pre-process, 6.163ms inference, 0.727ms NMS.

YOLOv7 batch 32 average inference time on V100 without TRT: 0.112ms pre-process, 2.795ms inference, 0.660ms NMS.

WongKinYiu, Jul 28 '22 09:07
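
To relate the per-stage timings above to a frames-per-second figure, the conversion is simply 1000 divided by the milliseconds per image. The sketch below applies it to the V100 numbers quoted above, once for inference alone and once for pre-process, inference, and NMS together. The helper names `fps` and `summarize` are made up for illustration. Under this arithmetic, 6.163 ms of inference alone corresponds to roughly 162 fps, while 14 ms per image corresponds to roughly 71 fps.

```python
def fps(ms_per_image: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


# Per-image timings (ms) quoted above for YOLOv7 on a V100 without TensorRT.
timings = {
    "batch 1": {"pre": 0.161, "inference": 6.163, "nms": 0.727},
    "batch 32": {"pre": 0.112, "inference": 2.795, "nms": 0.660},
}


def summarize(stages: dict) -> dict:
    """Return inference-only and end-to-end throughput for one set of stage timings."""
    total_ms = sum(stages.values())
    return {
        "total_ms": total_ms,
        "inference_fps": fps(stages["inference"]),
        "end_to_end_fps": fps(total_ms),
    }


for name, stages in timings.items():
    s = summarize(stages)
    print(
        f"{name}: {s['inference_fps']:.0f} fps inference only, "
        f"{s['end_to_end_fps']:.0f} fps end to end ({s['total_ms']:.3f} ms total)"
    )

# The 14 ms per image reported in the question, for comparison.
print(f"14 ms per image: {fps(14.0):.0f} fps")
```

The gap between the two columns shows why a headline fps number depends on whether pre-processing and NMS are included in the measurement.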

YOLOv7 inference time:

  • 6ms on GPU V100 (134 TFlops-TC)
  • 12ms on GPU T4 (65 TFlops-TC)

It is very strange that you get 14ms on an A100 GPU (310 TFlops-TC) when a T4 GPU (65 TFlops-TC) already reaches 12ms. The T4 is 4x slower than the A100.
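
The reasoning above can be checked with a rough calculation. The sketch below assumes latency scales inversely with peak tensor-core throughput, which is a simplification (memory bandwidth, batch size, and software overhead all matter), so the output is a sanity check rather than a prediction. The function name `naive_latency_ms` is hypothetical; the TFlops and millisecond figures are the ones quoted above.

```python
# Peak tensor-core throughput (TFlops) and YOLOv7 latency (ms) as quoted above.
gpus = {
    "V100": {"tflops": 134, "reported_ms": 6.0},
    "T4": {"tflops": 65, "reported_ms": 12.0},
    "A100": {"tflops": 310, "reported_ms": 14.0},  # figure reported in the question
}


def naive_latency_ms(ref_ms: float, ref_tflops: float, target_tflops: float) -> float:
    """Scale a reference latency by the inverse ratio of peak throughput.

    Assumption: latency is inversely proportional to peak TFlops.
    """
    return ref_ms * ref_tflops / target_tflops


ref = gpus["T4"]
for name, gpu in gpus.items():
    ratio = gpu["tflops"] / ref["tflops"]
    estimate = naive_latency_ms(ref["reported_ms"], ref["tflops"], gpu["tflops"])
    print(
        f"{name}: {ratio:.2f}x the T4 peak, naive estimate {estimate:.1f} ms, "
        f"reported {gpu['reported_ms']:.1f} ms"
    )
```

For the V100 the naive estimate lands close to the reported 6ms, whereas the 14ms reported on the A100 is several times higher than the estimate, which is what makes that measurement look suspicious.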


Comparison YOLOv7 vs YOLOv5

When tested in an identical environment on an NVIDIA T4 GPU:

YOLOv7 (51.2% AP, 12.6ms) is 1.5x faster and 6.3% AP more accurate than YOLOv5s6 (44.9% AP, 18.7ms)

https://colab.research.google.com/gist/AlexeyAB/56912451a33981d977ff9ea61025ae40/yolov7trtlinaom.ipynb#scrollTo=-tMYe8f27US9

!python test.py --data data/coco.yaml --img 640 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
...
Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1
...
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.512
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5s6.pt --name yolov5s6_1280_val
...
Speed: 0.7ms pre-process, 18.7ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.449

YOLOv7 (51.2% AP, 12.6ms) has almost the same accuracy but is about 4x faster than YOLOv5m6 (51.3% AP, 49.1ms)

https://colab.research.google.com/gist/AlexeyAB/857c4859a7a27abca8775245884d1ecf/yolov7trtlinaom.ipynb

!python test.py --data data/coco.yaml --img 640 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
...
Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.512
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5m6.pt --name yolov5m6_1280_val
...
Speed: 0.6ms pre-process, 49.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.513

Moreover, YOLOv7-w6 1280x1280 (54.6% AP, 29ms) has comparable accuracy but is 6.6x faster than YOLOv5x6 1280x1280 (55.0% AP, 190ms)

https://colab.research.google.com/gist/AlexeyAB/6f08816fa611def881327de0f5711ae5/yolov7trtlinaom.ipynb

!python test.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7-w6.pt --name yolov7_w6_1280_val
...
Speed: 28.5/1.4/29.9 ms inference/NMS/total per 1280x1280 image at batch-size 1
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.546
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5x6.pt --name yolov5x6_1280_val
...
Speed: 0.6ms pre-process, 190.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.550

YOLOv7-e6 1280x1280 (55.9% AP, 47ms) is 0.9% AP more accurate and 4x faster than YOLOv5x6 1280x1280 (55.0% AP, 188ms)

https://colab.research.google.com/gist/AlexeyAB/4065112d0d1a252eb433fef38c061f66/yolov7trtlinaom.ipynb#scrollTo=JE3o2PqQy2n4

!python test.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov7-e6.pt --name yolov7_e6_1280_val
...
Speed: 46.6/1.5/48.1 ms inference/NMS/total per 1280x1280 image at batch-size 1
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.559
!python val.py --data data/coco.yaml --img 1280 --batch 1 --conf 0.001 --iou 0.65 --device 0 --weights yolov5x6.pt --name yolov5x6_1280_val
...
Speed: 0.7ms pre-process, 188.1ms inference, 1.8ms NMS per image at shape (1, 3, 1280, 1280)
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.550

AlexeyAB, Jul 28 '22 19:07
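
The speedup figures in the comparison above can be recomputed from the pasted `Speed:` lines. The sketch below is a small illustration, not part of either repository: it parses the two log formats shown (the slash-separated one printed by `test.py` and the named-stage one printed by `val.py`) and divides the inference times. The dictionary keys, regular expressions, and the `inference_ms` helper are assumptions made for this example, and only inference time is compared, not pre-processing or NMS.

```python
import re

# "Speed:" lines copied verbatim from the logs above.
LOGS = {
    "yolov7 640": "Speed: 12.6/0.9/13.5 ms inference/NMS/total per 640x640 image at batch-size 1",
    "yolov5s6 1280": "Speed: 0.7ms pre-process, 18.7ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)",
    "yolov5m6 1280": "Speed: 0.6ms pre-process, 49.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)",
    "yolov7-w6 1280": "Speed: 28.5/1.4/29.9 ms inference/NMS/total per 1280x1280 image at batch-size 1",
    "yolov5x6 1280 (run 1)": "Speed: 0.6ms pre-process, 190.1ms inference, 1.7ms NMS per image at shape (1, 3, 1280, 1280)",
    "yolov7-e6 1280": "Speed: 46.6/1.5/48.1 ms inference/NMS/total per 1280x1280 image at batch-size 1",
    "yolov5x6 1280 (run 2)": "Speed: 0.7ms pre-process, 188.1ms inference, 1.8ms NMS per image at shape (1, 3, 1280, 1280)",
}

# Slash-separated format: "Speed: <inference>/<NMS>/<total> ms inference/NMS/total ..."
SLASH_FORMAT = re.compile(r"Speed:\s*([\d.]+)/([\d.]+)/([\d.]+) ms inference/NMS/total")
# Named-stage format: "... <x>ms inference, ..."
NAMED_FORMAT = re.compile(r"([\d.]+)ms inference")


def inference_ms(line: str) -> float:
    """Extract the inference time in ms from either log format."""
    match = SLASH_FORMAT.search(line) or NAMED_FORMAT.search(line)
    if match is None:
        raise ValueError(f"unrecognized Speed line: {line!r}")
    return float(match.group(1))


# (faster model, slower model) pairs as compared in the text above.
PAIRS = [
    ("yolov7 640", "yolov5s6 1280"),
    ("yolov7 640", "yolov5m6 1280"),
    ("yolov7-w6 1280", "yolov5x6 1280 (run 1)"),
    ("yolov7-e6 1280", "yolov5x6 1280 (run 2)"),
]

for fast, slow in PAIRS:
    speedup = inference_ms(LOGS[slow]) / inference_ms(LOGS[fast])
    print(f"{fast} vs {slow}: {speedup:.1f}x faster (inference only)")
```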

YOLOv7-tiny inference is slower than YOLOv5n at the same image size (e.g. 640). Are any changes needed when using detect.py?

nonen7676, Jul 30 '22 08:07
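
A like-for-like comparison between two models generally needs the same timing procedure for both. The following is a minimal sketch of such a procedure, not the method used by detect.py: it discards warm-up calls and, if a `sync` callable is supplied, waits for queued device work before reading the clock (for PyTorch on CUDA, `torch.cuda.synchronize` would be the natural choice). The names `time_per_call_ms` and `fake_inference` are hypothetical, and the workload here is a stand-in sleep rather than a real model.

```python
import time
from statistics import mean


def time_per_call_ms(fn, warmup=10, runs=50, sync=None):
    """Average wall-clock time of fn() in milliseconds, measured after warm-up calls.

    sync is an optional callable that blocks until asynchronous device work
    has finished; it is called before the clock is read.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def fake_inference():
    """Stand-in workload (assumption): sleeps about 2 ms instead of running a model."""
    time.sleep(0.002)


avg_ms = time_per_call_ms(fake_inference, warmup=2, runs=5)
print(f"average: {avg_ms:.2f} ms per call")
```

With a real model, `fn` would wrap a single forward pass on a fixed input, and the same `warmup`, `runs`, image size, and batch size would be used for every model being compared.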