mmpose
Inference speed is low
Config: "configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py", model: "https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth" FPS: 3-4 torch 1.10.1+cu113 CUDA 11.3 GPU: Nvidia Geforce GTX 1660 Ti
According to this Link, I should get at least 6-7 FPS.
@piercus Help me to resolve this issue.
@Harsh-Vavaiya Could you please share the commands that you ran, so that we can see what device and batch size you used, along with other useful information? BTW, what CPU did you use?
@liqikai9 CPU: Intel i5- 9th gen. RAM: 8 GB
Command: python /demo/top_down_video_demo_full_frame_without_det.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth --video-path ../data/demo_video.mp4 --show
It is worth mentioning that the results we show in inference_speed_summary are acquired by running the following command:
python .dev_scripts/benchmark/speed_test.py
That script omits the time for data pre-processing and only measures the time for model forwarding and data post-processing. More details about the comparison rules can be found here.
In your case, the demo also includes the pre-processing of images. How did you calculate the FPS? Did you include the time for data pre-processing?
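For illustration, here is a minimal sketch of the distinction being made; it is not the actual speed_test.py code, and the tiny model and random frame are stand-ins so it runs on its own. Only the second timed block corresponds to what the benchmark reports.

import time
import numpy as np
import torch

# Stand-ins so the sketch is self-contained; in the real demo these would be
# the image pipeline and the HRNet pose model.
model = torch.nn.Conv2d(3, 17, kernel_size=3, padding=1)
frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)

# Pre-processing: CPU work that the demo script includes, but the benchmark omits.
t0 = time.time()
img = torch.from_numpy(frame).float().permute(2, 0, 1).unsqueeze(0) / 255.0
img = torch.nn.functional.interpolate(img, size=(256, 192))
t_pre = time.time() - t0

# Model forwarding + post-processing: the part speed_test.py measures.
t1 = time.time()
with torch.no_grad():
    heatmaps = model(img)
keypoints = heatmaps.flatten(2).argmax(dim=-1)  # toy "post-processing"
t_fwd = time.time() - t1

print(f'pre-processing: {t_pre:.4f}s, forwarding + post-processing: {t_fwd:.4f}s')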
I've calculated only the inference speed, at line 109 in the script.
start = time.time()
pose_results, returned_outputs = inference_top_down_pose_model(
    pose_model,
    img,
    person_results,
    format='xyxy',
    dataset=dataset,
    dataset_info=dataset_info,
    return_heatmap=return_heatmap,
    outputs=output_layer_names)
print(1 / (time.time() - start))
Like this, and torch is also using the GPU.
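(A quick generic PyTorch check to confirm this, assuming pose_model is the model loaded in the demo script:)

import torch

print(torch.cuda.is_available())             # True if CUDA is usable
print(next(pose_model.parameters()).device)  # should print something like cuda:0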
This is a little bit strange. On my device (GPU: GeForce GTX 1660 SUPER, CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, Ubuntu 16.04), running this demo script on the GPU with the default arguments reaches nearly 19 FPS. Running the same demo script on the CPU gives about 6 FPS.
Could you please provide the version of mmpose and the complete script you run? So that we can locate the possible problem. Thanks.
Find the code here: Link. MMPose version: 0.22.0, mmcv-full: 1.4.5.
I think the problem may be related to the video file you use. What is the resolution of your input video?
You can run the demo script with this command:
python demo/top_down_video_demo_full_frame_without_det.py configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth --video-path demo/resources/demo.mp4 --show
which will use the demo.mp4 file within mmpose, whose resolution is 960 x 540. Then check whether the FPS is still at a low level.
As a reference, on my device (GPU: GeForce GTX 1660 SUPER, CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, Ubuntu 16.04), running the above demo script on the GPU with the default arguments gives about 16 FPS.
I got an FPS of around 5.796 using your demo video and command. The resolution of my own input video was 1920*1080.
Is the FPS of 5.796 the average value? Try to get the average FPS if the video is long.
video: demo/resources/demo.mp4 (only 5 frames), res: 960 x 540
per-frame FPS: 0.09442310560884373, 2.9375147512923356, 4.566262620544298, 5.263261982025393, 5.796464053886348
mean FPS: around 3.726
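(Side note: with only 5 frames, the very slow first, warm-up, frame dominates any summary. A small sketch using the rounded numbers above, with made-up variable names, shows how the figure changes depending on whether per-frame FPS values are averaged, total time is used, or the warm-up frame is dropped:)

# Per-frame FPS values from the run above (rounded).
per_frame_fps = [0.0944, 2.9375, 4.5663, 5.2633, 5.7965]
per_frame_seconds = [1 / f for f in per_frame_fps]

mean_of_fps = sum(per_frame_fps) / len(per_frame_fps)          # ~3.73, the value reported above
overall_fps = len(per_frame_seconds) / sum(per_frame_seconds)  # dominated by the warm-up frame
steady_fps = (len(per_frame_seconds) - 1) / sum(per_frame_seconds[1:])

print(f'mean of per-frame FPS:   {mean_of_fps:.3f}')
print(f'frames / total time:     {overall_fps:.3f}')
print(f'same, excluding warm-up: {steady_fps:.3f}')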
Your script includes the time for the data pre-processing work, which is done on the CPU. Since we have different CPUs, this is not comparable. Also, I think we should call torch.cuda.synchronize() before we record the time. You can find more detail about our speed test script here.
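For example, a minimal sketch of what I mean, reusing the variables from your snippet (so it is not runnable on its own), with synchronize calls around the timed region:

import time
import torch

from mmpose.apis import inference_top_down_pose_model

# Make sure previously queued GPU work is done before starting the clock,
# and that the forward pass has actually finished before stopping it
# (CUDA kernel launches are asynchronous).
torch.cuda.synchronize()
start = time.time()

pose_results, returned_outputs = inference_top_down_pose_model(
    pose_model,
    img,
    person_results,
    format='xyxy',
    dataset=dataset,
    dataset_info=dataset_info,
    return_heatmap=return_heatmap,
    outputs=output_layer_names)

torch.cuda.synchronize()
elapsed = time.time() - start
print(f'FPS: {1 / elapsed:.2f}')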
Try running this command on your machine:
python .dev_scripts/benchmark/speed_test.py
And see if the FPS of this config, configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py, is close to the FPS value in the inference_speed_summary.
Hi @Harsh-Vavaiya , is there any update on this issue?
After following your instructions, I got results that look the same as in the table, but when using mmpose's inference function I'm not getting that FPS. Why is there such a big difference?
The FPS difference is mainly because your script includes the time for data pre-processing, which is done on the CPU.
We plan to extend the inference speed summary later to include your case. Feel free to let us know if you have any other inference speed tests you need.