
CPU Performing better than GPU for Yolo Predictions

Open iravikiran opened this issue 1 year ago • 1 comments

Hello Team,

I have a RISC-V dev platform with an IMG-GPU. I was able to build the Vulkan-NCNN framework successfully and run Yolo object detection on it. However, the CPU is outperforming the GPU: Yolo v8 object detection takes around 3.0 seconds on the CPU, while the same detection takes around 6.2 seconds on the GPU.

Can you please help me with this? Is there a way to optimise or improve GPU performance so that it beats the CPU? I would also appreciate any further details you can share on this issue.
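
For anyone reproducing the comparison, here is a minimal sketch of how the CPU and GPU timings could be collected in a like-for-like way. It is not the code used for the numbers above. `median_ms` is a hypothetical helper name, and the warm-up and iteration counts are assumptions; the warm-up calls are there so that one-time setup cost on the first call is not counted.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical helper (not part of ncnn): calls `run` `warmup` times without
// timing, then `iters` more times with timing, and returns the median
// wall-clock time of the timed calls in milliseconds.
double median_ms(const std::function<void()>& run, int warmup, int iters)
{
    if (iters <= 0)
        return 0.0; // nothing to measure

    // Untimed warm-up calls, so one-time setup work is excluded.
    for (int i = 0; i < warmup; ++i)
        run();

    std::vector<double> samples;
    samples.reserve(static_cast<size_t>(iters));

    for (int i = 0; i < iters; ++i)
    {
        const auto t0 = std::chrono::steady_clock::now();
        run();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }

    // The median is less sensitive to a single slow outlier than the mean.
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

With ncnn, `run` would presumably wrap one complete detection pass, measured once with `net.opt.use_vulkan_compute` disabled and once with it enabled, using the same input image in both cases.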



Regards, Ravi Kiran

iravikiran, Sep 17 '24 05:09

The low performance on the IMG-GPU is a known optimization issue in the IMG driver. I tried to measure the peak performance of the IMG-GPU with vkpeak but was unable to do so. This needs to be fixed on the driver side.

nihui, Apr 27 '25 04:04