Paddle Inference speed issue with the YOLOv5 series (yolov5n)
Problem confirmation: Search before asking
- [X] I have searched the question and found no related answer.
Please ask your question
Log output:

```
2024-10-18 16:36:09,082 - benchmark_utils - INFO - Paddle Inference benchmark log will be saved to /home/aistudio/PaddleYOLO/deploy/python/../../output/yolov5_n_300e_coco.log
2024-10-18 16:36:09,083 - benchmark_utils - INFO -
2024-10-18 16:36:09,083 - benchmark_utils - INFO - ---------------------- Paddle info ----------------------
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] paddle_version: 3.0.0-beta1
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] paddle_commit: a842a0f40f6111fb0c2df218130d0560aa747bc8
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] paddle_branch: HEAD
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] log_api_version: 1.0.3
2024-10-18 16:36:09,083 - benchmark_utils - INFO - ----------------------- Conf info -----------------------
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] runtime_device: gpu
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] ir_optim: True
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] enable_memory_optim: True
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] enable_tensorrt: False
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] enable_mkldnn: True
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] cpu_math_library_num_threads: 1
2024-10-18 16:36:09,083 - benchmark_utils - INFO - ----------------------- Model info ----------------------
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] model_name: yolov5_n_300e_coco
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] precision: paddle
2024-10-18 16:36:09,083 - benchmark_utils - INFO - ----------------------- Data info -----------------------
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] batch_size: 1
2024-10-18 16:36:09,083 - benchmark_utils - INFO - [DET] input_shape: dynamic_shape
2024-10-18 16:36:09,084 - benchmark_utils - INFO - [DET] data_num: 26
2024-10-18 16:36:09,084 - benchmark_utils - INFO - ----------------------- Perf info -----------------------
2024-10-18 16:36:09,084 - benchmark_utils - INFO - [DET] cpu_rss(MB): 3586, cpu_vms: 0, cpu_shared_mb: 0, cpu_dirty_mb: 0, cpu_util: 0%
2024-10-18 16:36:09,084 - benchmark_utils - INFO - [DET] gpu_rss(MB): 1803, gpu_util: 25.0%, gpu_mem_util: 0%
2024-10-18 16:36:09,084 - benchmark_utils - INFO - [DET] total time spent(s): 0.9325
2024-10-18 16:36:09,084 - benchmark_utils - INFO - [DET] preprocess_time(ms): 22.2, inference_time(ms): 13.7, postprocess_time(ms): 0.0
```
Command:

```shell
python deploy/python/infer.py --model_dir output_inference/yolov5_n_300e_coco/ --image_dir dataset/coco_ssdd/mytestimgs/ --device gpu --batch_size 20 --run_benchmark=True
```
With this command, the average inference_time I get is 13.7 ms; but if I drop `--run_benchmark=True`, the printed average time is 69 ms. Why is the difference between these two results so large?
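A likely cause of the gap is measurement methodology: benchmark mode typically runs warmup iterations first and averages only the steady-state loop, while a plain run folds the first cold inference (CUDA context creation, cuDNN autotuning, memory allocation) into its average. This is a minimal, framework-agnostic sketch of the two timing styles; `predict` here is a stand-in for the real Paddle predictor, not the actual API:

```python
import time

def predict(x):
    # Stand-in predictor: the first call simulates one-time setup cost
    # (CUDA context, kernel autotuning, allocator growth).
    if not getattr(predict, "warmed", False):
        predict.warmed = True
        time.sleep(0.05)   # simulated cold-start overhead
    time.sleep(0.005)      # simulated steady-state inference
    return x

def average_latency_ms(n_runs, n_warmup=0):
    for _ in range(n_warmup):       # warmup calls are not timed
        predict(None)
    start = time.perf_counter()
    for _ in range(n_runs):
        predict(None)
    return (time.perf_counter() - start) / n_runs * 1e3

cold = average_latency_ms(10)               # first cold call included
predict.warmed = False                      # reset the stand-in
warm = average_latency_ms(10, n_warmup=2)   # benchmark-style measurement
print(f"no warmup: {cold:.1f} ms, with warmup: {warm:.1f} ms")
```

With the simulated costs above, the cold average is roughly double the warm one, which mirrors (though does not fully explain) the 69 ms vs. 13.7 ms gap; the rest can come from what each code path includes in "inference" (e.g. host-to-device copies and synchronization).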
```
------------------ Inference Time Info ----------------------
total_time(ms): 2579.2000000000003, img_num: 26
average latency time(ms): 99.20, QPS: 10.080645
preprocess_time(ms): 30.20, inference_time(ms): 69.00, postprocess_time(ms): 0.00
```
I also tested on an RTX 4090 with `--run_benchmark=True` and got an inference time of about 4 ms, yet the official page, using a Tesla T4, reports 1.5 ms?
```
2024-10-18 16:46:00,077 - benchmark_utils - INFO - ---------------------- Paddle info ----------------------
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] paddle_version: 2.4.1
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] paddle_commit:
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] paddle_branch:
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] log_api_version: 1.0.3
2024-10-18 16:46:00,078 - benchmark_utils - INFO - ----------------------- Conf info -----------------------
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] runtime_device: gpu
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] ir_optim: True
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] enable_memory_optim: True
2024-10-18 16:46:00,078 - benchmark_utils - INFO - [DET] enable_tensorrt: False
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] enable_mkldnn: False
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] cpu_math_library_num_threads: 1
2024-10-18 16:46:00,079 - benchmark_utils - INFO - ----------------------- Model info ----------------------
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] model_name: yolov5_n_300e_coco
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] precision: paddle
2024-10-18 16:46:00,079 - benchmark_utils - INFO - ----------------------- Data info -----------------------
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] batch_size: 1
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] input_shape: dynamic_shape
2024-10-18 16:46:00,079 - benchmark_utils - INFO - [DET] data_num: 39
2024-10-18 16:46:00,080 - benchmark_utils - INFO - ----------------------- Perf info -----------------------
2024-10-18 16:46:00,080 - benchmark_utils - INFO - [DET] cpu_rss(MB): 71, cpu_vms: 0, cpu_shared_mb: 0, cpu_dirty_mb: 0, cpu_util: 0%
2024-10-18 16:46:00,080 - benchmark_utils - INFO - [DET] gpu_rss(MB): 89, gpu_util: 1.18%, gpu_mem_util: 0%
2024-10-18 16:46:00,080 - benchmark_utils - INFO - [DET] total time spent(s): 0.828
2024-10-18 16:46:00,080 - benchmark_utils - INFO - [DET] preprocess_time(ms): 16.9, inference_time(ms): 4.4, postprocess_time(ms): 0.0
```
For the speed test, please refer to the link.
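One thing worth noting when comparing against the official T4 numbers: both logs above show `enable_tensorrt: False`, and official detection benchmarks are usually measured with TensorRT FP16 enabled. A hedged sketch of such a run, assuming `deploy/python/infer.py` accepts a `--run_mode` flag with values like `paddle` / `trt_fp32` / `trt_fp16` as in PaddleDetection's Python deployment docs (not verified against this exact repo version):

```shell
# Assumption: --run_mode selects the inference backend; trt_fp16 enables
# TensorRT with half precision, which is what the published T4 numbers use.
python deploy/python/infer.py \
    --model_dir output_inference/yolov5_n_300e_coco/ \
    --image_dir dataset/coco_ssdd/mytestimgs/ \
    --device gpu \
    --run_mode trt_fp16 \
    --run_benchmark=True
```

If the flag is supported in your version, this should narrow the gap between your measured latency and the published figures.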
This issue has had no response for a long time and will be closed. You can reopen it or open a new issue if you are still confused.
From Bot