
iOS: enabling GPU acceleration fails with a threadgroup size error

Open zaoqilai opened this issue 9 months ago • 3 comments

Setup: ncnn is ncnn-20250503-ios, and MoltenVK is the one from v1.3.0-rc1.

Symptoms: 1. The project compiles. With the GPU left disabled there is no problem: with use_vulkan_compute = false everything works normally.

2. With use_vulkan_compute = true, the following error is reported: -[MTLDebugComputeCommandEncoder _validateThreadsPerThreadgroup:]:1276: failed assertion `(threadsPerThreadgroup.width(8) * threadsPerThreadgroup.height(3) * threadsPerThreadgroup.depth(1))(24) must be multiples of 32.'

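From what I have read, ncnn's default threadgroup size is 32. What is going on here? I have not set any Metal threadgroup size locally. Is this related to the model?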

zaoqilai avatar May 27 '25 06:05 zaoqilai

Using the downloaded yolov8n.ncnn.param with the GPU enabled, the error is as follows: -[MTLDebugComputeCommandEncoder _validateThreadsPerThreadgroup:]:1276: failed assertion `(threadsPerThreadgroup.width(4) * threadsPerThreadgroup.height(4) * threadsPerThreadgroup.depth(1))(16) must be multiples of 32.'
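
For reference, the arithmetic behind the two assertion messages can be sketched as below. This is only an illustration of the check as it is worded in the error text (width * height * depth must be a multiple of 32), not the actual Metal, MoltenVK, or ncnn code. The names ThreadgroupSize, total_threads, and is_multiple_of_32 are hypothetical helpers made up for this sketch.

```cpp
#include <cassert>

// Hypothetical helper type for illustration; not part of ncnn, MoltenVK, or Metal.
struct ThreadgroupSize {
    unsigned width;
    unsigned height;
    unsigned depth;
};

// Total number of threads in one threadgroup: width * height * depth.
constexpr unsigned total_threads(ThreadgroupSize s) {
    return s.width * s.height * s.depth;
}

// Mirrors the condition quoted in the assertion text: the total must be a multiple of 32.
constexpr bool is_multiple_of_32(ThreadgroupSize s) {
    return total_threads(s) % 32 == 0;
}

// The two sizes reported in this issue both fail the condition.
static_assert(total_threads(ThreadgroupSize{8, 3, 1}) == 24, "first report: 8 * 3 * 1");
static_assert(!is_multiple_of_32(ThreadgroupSize{8, 3, 1}), "24 is not a multiple of 32");
static_assert(total_threads(ThreadgroupSize{4, 4, 1}) == 16, "second report: 4 * 4 * 1");
static_assert(!is_multiple_of_32(ThreadgroupSize{4, 4, 1}), "16 is not a multiple of 32");

// An assumed size, chosen only to show a passing case: 8 * 4 * 1 = 32.
static_assert(is_multiple_of_32(ThreadgroupSize{8, 4, 1}), "32 is a multiple of 32");
```

Both reported sizes (8 x 3 x 1 = 24 and 4 x 4 x 1 = 16) fall below 32, so neither can satisfy the condition, whichever model is loaded.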

zaoqilai avatar May 28 '25 06:05 zaoqilai

https://github.com/Tencent/ncnn/pull/2483

nihui avatar Jun 06 '25 05:06 nihui

fixed in https://github.com/Tencent/ncnn/commit/8a2eab111478ded313d2e43dd5274144ec0a65f7

nihui avatar Jun 09 '25 07:06 nihui

Can this fix be included in the official library?

cjlkbxt avatar Jun 25 '25 09:06 cjlkbxt

It will be included in a new release. If you are in a hurry to test, you can try https://github.com/nihui/ncnn/releases/tag/20250912

nihui avatar Sep 15 '25 02:09 nihui