yolov9
TRTExec Results
Hi, I converted both yolov8nano.pt and yolov9-t-converted.pt to ONNX files with the arguments simplify=True, opset=12, imgsz=640. Then I tested them with trtexec on a Jetson Nano device running JetPack 4.6. However, yolov8n.pt still seems faster than yours. What could be the problem? Here are the results:
yolov8:
[07/31/2024-22:47:15] [I] === Performance summary ===
[07/31/2024-22:47:15] [I] Throughput: 18.9615 qps
[07/31/2024-22:47:15] [I] Latency: min = 52.4875 ms, max = 57.1544 ms, mean = 52.7275 ms, median = 52.6539 ms, percentile(99%) = 57.1544 ms
[07/31/2024-22:47:15] [I] End-to-End Host Latency: min = 52.4974 ms, max = 57.1649 ms, mean = 52.7378 ms, median = 52.6639 ms, percentile(99%) = 57.1649 ms
[07/31/2024-22:47:15] [I] Enqueue Time: min = 6.0144 ms, max = 18.371 ms, mean = 11.6272 ms, median = 10.9541 ms, percentile(99%) = 18.371 ms
[07/31/2024-22:47:15] [I] H2D Latency: min = 0.471436 ms, max = 0.490677 ms, mean = 0.475766 ms, median = 0.475586 ms, percentile(99%) = 0.490677 ms
[07/31/2024-22:47:15] [I] GPU Compute Time: min = 51.7396 ms, max = 56.3861 ms, mean = 51.9737 ms, median = 51.9009 ms, percentile(99%) = 56.3861 ms
[07/31/2024-22:47:15] [I] D2H Latency: min = 0.275635 ms, max = 0.281738 ms, mean = 0.278016 ms, median = 0.277588 ms, percentile(99%) = 0.281738 ms
[07/31/2024-22:47:15] [I] Total Host Walltime: 3.11157 s
[07/31/2024-22:47:15] [I] Total GPU Compute Time: 3.06645 s
[07/31/2024-22:47:15] [I] Explanations of the performance metrics are printed in the verbose logs.
yolov9:
[07/31/2024-21:47:12] [I] === Performance summary ===
[07/31/2024-21:47:12] [I] Throughput: 15.1949 qps
[07/31/2024-21:47:12] [I] Latency: min = 65.0674 ms, max = 84.4445 ms, mean = 65.7986 ms, median = 65.176 ms, percentile(99%) = 84.4445 ms
[07/31/2024-21:47:12] [I] End-to-End Host Latency: min = 65.0799 ms, max = 84.4609 ms, mean = 65.8111 ms, median = 65.1885 ms, percentile(99%) = 84.4609 ms
[07/31/2024-21:47:12] [I] Enqueue Time: min = 26.2678 ms, max = 268.322 ms, mean = 33.2359 ms, median = 27.5183 ms, percentile(99%) = 268.322 ms
[07/31/2024-21:47:12] [I] H2D Latency: min = 0.47168 ms, max = 0.929291 ms, mean = 0.484728 ms, median = 0.473633 ms, percentile(99%) = 0.929291 ms
[07/31/2024-21:47:12] [I] GPU Compute Time: min = 64.319 ms, max = 83.2169 ms, mean = 65.0383 ms, median = 64.4297 ms, percentile(99%) = 83.2169 ms
[07/31/2024-21:47:12] [I] D2H Latency: min = 0.273193 ms, max = 0.29834 ms, mean = 0.27559 ms, median = 0.274902 ms, percentile(99%) = 0.29834 ms
[07/31/2024-21:47:12] [I] Total Host Walltime: 2.96153 s
[07/31/2024-21:47:12] [I] Total GPU Compute Time: 2.92672 s
[07/31/2024-21:47:12] [I] Explanations of the performance metrics are printed in the verbose logs.
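To make the two summaries easier to compare, here is a minimal sketch that parses the mean values out of the trtexec lines and prints the yolov9/yolov8 ratio for each metric. The names `parse_summary` and `compare`, and the regular expressions, are my own helpers written against the log format shown above, not part of trtexec. Only three lines from each log are included to keep it short.

```python
import re

# Patterns written against the summary lines pasted above (an assumption:
# other TensorRT versions may format these lines differently).
METRIC_RE = re.compile(r"\[I\] ([A-Za-z0-9 \-]+): min = .*?mean = ([\d.]+) ms")
THROUGHPUT_RE = re.compile(r"\[I\] Throughput: ([\d.]+) qps")

V8_LOG = """\
[07/31/2024-22:47:15] [I] Throughput: 18.9615 qps
[07/31/2024-22:47:15] [I] Enqueue Time: min = 6.0144 ms, max = 18.371 ms, mean = 11.6272 ms, median = 10.9541 ms, percentile(99%) = 18.371 ms
[07/31/2024-22:47:15] [I] GPU Compute Time: min = 51.7396 ms, max = 56.3861 ms, mean = 51.9737 ms, median = 51.9009 ms, percentile(99%) = 56.3861 ms
"""

V9_LOG = """\
[07/31/2024-21:47:12] [I] Throughput: 15.1949 qps
[07/31/2024-21:47:12] [I] Enqueue Time: min = 26.2678 ms, max = 268.322 ms, mean = 33.2359 ms, median = 27.5183 ms, percentile(99%) = 268.322 ms
[07/31/2024-21:47:12] [I] GPU Compute Time: min = 64.319 ms, max = 83.2169 ms, mean = 65.0383 ms, median = 64.4297 ms, percentile(99%) = 83.2169 ms
"""


def parse_summary(log_text):
    """Return {metric name: value} for one trtexec performance summary."""
    stats = {}
    for line in log_text.splitlines():
        m = THROUGHPUT_RE.search(line)
        if m:
            stats["Throughput (qps)"] = float(m.group(1))
            continue
        m = METRIC_RE.search(line)
        if m:
            # Keep only the mean; min/max/median are ignored in this sketch.
            stats[m.group(1) + " (mean ms)"] = float(m.group(2))
    return stats


def compare(base, other):
    """Return {metric: other/base ratio} for metrics present in both runs."""
    return {k: other[k] / base[k] for k in base if k in other}


if __name__ == "__main__":
    v8 = parse_summary(V8_LOG)
    v9 = parse_summary(V9_LOG)
    for name, ratio in compare(v8, v9).items():
        print(f"{name}: yolov8={v8[name]:.4f} yolov9={v9[name]:.4f} ratio={ratio:.2f}")
```

On the numbers above, the mean GPU compute time for yolov9 is roughly 25% higher than for yolov8, and the mean enqueue time is nearly three times higher.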