mmpose
Inference speed is low
Config: "configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py", model: "https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth" FPS: 3-4 torch 1.10.1+cu113 CUDA 11.3 GPU: Nvidia Geforce GTX 1660 Ti
According to this Link, I should get at least 6-7 FPS.
@piercus Help me to resolve this issue.
@Harsh-Vavaiya Could you please share the commands that you ran, so that we can see what device and batch size you used, along with other useful information? BTW, what CPU did you use?
@liqikai9 CPU: Intel i5- 9th gen. RAM: 8 GB
Command: python /demo/top_down_video_demo_full_frame_without_det.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth --video-path ../data/demo_video.mp4 --show
It is worth mentioning that the results we show in inference_speed_summary are acquired by running the following command:
python .dev_scripts/benchmark/speed_test.py
That script omits the time for data pre-processing and only measures the time for model forwarding and data post-processing. More details about the comparison rules can be found here.
In your case, the demo also includes the pre-processing of images. How did you calculate the FPS? Did you include the time for data pre-processing?
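For illustration, here is a minimal sketch of the distinction being made; it is not the actual speed_test.py code, and the tiny model and random frame are stand-ins so it runs on its own. Only the second timed block corresponds to what the benchmark reports.

import time
import numpy as np
import torch

# Stand-ins so the sketch is self-contained; in the real demo these would be
# the image pipeline and the HRNet pose model.
model = torch.nn.Conv2d(3, 17, kernel_size=3, padding=1)
frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)

# Pre-processing: CPU work that the demo script includes, but the benchmark omits.
t0 = time.time()
img = torch.from_numpy(frame).float().permute(2, 0, 1).unsqueeze(0) / 255.0
img = torch.nn.functional.interpolate(img, size=(256, 192))
t_pre = time.time() - t0

# Model forwarding + post-processing: the part speed_test.py measures.
t1 = time.time()
with torch.no_grad():
    heatmaps = model(img)
keypoints = heatmaps.flatten(2).argmax(dim=-1)  # toy "post-processing"
t_fwd = time.time() - t1

print(f'pre-processing: {t_pre:.4f}s, forwarding + post-processing: {t_fwd:.4f}s')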
I've calculated only the inference speed, at line 109 in the script.
start = time.time()
pose_results, returned_outputs = inference_top_down_pose_model(
    pose_model,
    img,
    person_results,
    format='xyxy',
    dataset=dataset,
    dataset_info=dataset_info,
    return_heatmap=return_heatmap,
    outputs=output_layer_names)
print(1 / (time.time() - start))
Like this, and torch is also using the GPU.
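(A quick generic PyTorch check to confirm this, assuming pose_model is the model loaded in the demo script:)

import torch

print(torch.cuda.is_available())             # True if CUDA is usable
print(next(pose_model.parameters()).device)  # should print something like cuda:0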
This is a little bit strange. On my device (GPU: GeForce GTX 1660 SUPER, CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, Ubuntu 16.04), running this demo script on the GPU with the default arguments reaches nearly 19 FPS. Running the same demo script on the CPU gives about 6 FPS.
Could you please provide the version of mmpose and the complete script you run? So that we can locate the possible problem. Thanks.
Find the code here: Link. MMPose version: 0.22.0, mmcv-full: 1.4.5.
I think the problem may be related to the video file you use. What is the resolution of your input video?
You can run the demo script with this command:
python demo/top_down_video_demo_full_frame_without_det.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth --video-path demo/resources/demo.mp4 --show
which will use the demo.mp4 file within mmpose, whose resolution is 960 x 540. Then check whether the FPS is still at a low level.
As a reference, on my device (GPU: GeForce GTX 1660 SUPER, CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, Ubuntu 16.04), running the above demo script on the GPU with the default arguments gives about 16 FPS.
I got an FPS of around 5.796 using your demo video and command. The resolution of my own input video was 1920*1080.
Is the FPS of 5.796 the average value? Try to get the average FPS if the video is long.
video: demo/resources/demo.mp4 (only 5 frames), res: 960 x 540
per-frame FPS: 0.09442310560884373, 2.9375147512923356, 4.566262620544298, 5.263261982025393, 5.796464053886348
mean FPS: around 3.726
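(Side note: with only 5 frames, the very slow first, warm-up, frame dominates any summary. A small sketch using the rounded numbers above, with made-up variable names, shows how the figure changes depending on whether per-frame FPS values are averaged, total time is used, or the warm-up frame is dropped:)

# Per-frame FPS values from the run above (rounded).
per_frame_fps = [0.0944, 2.9375, 4.5663, 5.2633, 5.7965]
per_frame_seconds = [1 / f for f in per_frame_fps]

mean_of_fps = sum(per_frame_fps) / len(per_frame_fps)          # ~3.73, the value reported above
overall_fps = len(per_frame_seconds) / sum(per_frame_seconds)  # dominated by the warm-up frame
steady_fps = (len(per_frame_seconds) - 1) / sum(per_frame_seconds[1:])

print(f'mean of per-frame FPS:   {mean_of_fps:.3f}')
print(f'frames / total time:     {overall_fps:.3f}')
print(f'same, excluding warm-up: {steady_fps:.3f}')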
Your script includes the time for the data pre-processing work, which is done on the CPU. Since we have different CPUs, this is not comparable. Also, I think we should call torch.cuda.synchronize() before we record the time. You can find more detail about our speed test script here.
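For example, a minimal sketch of what I mean, reusing the variables from your snippet (so it is not runnable on its own), with synchronize calls around the timed region:

import time
import torch

from mmpose.apis import inference_top_down_pose_model

# Make sure previously queued GPU work is done before starting the clock,
# and that the forward pass has actually finished before stopping it
# (CUDA kernel launches are asynchronous).
torch.cuda.synchronize()
start = time.time()

pose_results, returned_outputs = inference_top_down_pose_model(
    pose_model,
    img,
    person_results,
    format='xyxy',
    dataset=dataset,
    dataset_info=dataset_info,
    return_heatmap=return_heatmap,
    outputs=output_layer_names)

torch.cuda.synchronize()
elapsed = time.time() - start
print(f'FPS: {1 / elapsed:.2f}')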
Try running this command on your machine:
python .dev_scripts/benchmark/speed_test.py
And see if the FPS of this config, configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py, is close to the FPS value in the inference_speed_summary.
Hi @Harsh-Vavaiya , is there any update on this issue?
After following your instructions, I got results that look the same as in the table, but when using mmpose's inference function I'm not getting that FPS. Why is there such a big difference?
The FPS difference is mainly because your script includes the time for data pre-processing, which is done on the CPU.
We plan to extend the inference speed summary later to include your case. Feel free to let us know if you have any other inference speed tests you need.