
GRU model: multi-batch inference on the CPU backend shows no performance improvement

Open yklidelong opened this issue 2 years ago • 6 comments

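I measured the inference latency of a GRU model at batch 1 and at batch 512 on the Intel CPU backend. The batch-512 latency is roughly 512 times the batch-1 latency. Setting nthread to 1, 4, or 32 does not change the latency, although CPU utilization is 100%, 400%, and 3100% respectively. Is this expected?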
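The two observations above can be expressed as simple ratios. The following is a minimal sketch, not part of the original report: the helper names and the latency values are hypothetical placeholders, since the thread does not give the batch-1 latency.

```python
def batch_scaling_ratio(t_single_ms, t_batch_ms, batch):
    """Measured batch latency divided by the linear baseline (batch * t_single).

    A value near 1.0 means batching gives no benefit over running the
    single-batch model `batch` times; a value well below 1.0 means the
    work is being amortized across the batch.
    """
    return t_batch_ms / (batch * t_single_ms)


def parallel_efficiency(t_one_thread_ms, t_n_threads_ms, n_threads):
    """Speedup over one thread, divided by the number of threads used."""
    speedup = t_one_thread_ms / t_n_threads_ms
    return speedup / n_threads


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    ratio = batch_scaling_ratio(t_single_ms=0.04, t_batch_ms=20.48, batch=512)
    eff = parallel_efficiency(t_one_thread_ms=20.0, t_n_threads_ms=20.0, n_threads=32)
    print(f"batch scaling ratio: {ratio:.2f}")
    print(f"parallel efficiency at 32 threads: {eff:.4f}")
```

A scaling ratio close to 1.0 combined with a parallel efficiency close to 1/nthread would match the symptom described: extra threads keep the cores busy without shortening the run.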

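The model was exported to ONNX and then converted to an MNN model. The batch-1 model takes inputs of shape [1,1,47] and [2,1,64]; the batch-512 model takes inputs of shape [512,1,47] and [2,512,64].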

yklidelong avatar Dec 26 '23 09:12 yklidelong

Could you send us the model so we can take a look? Email: [email protected]. Please pack it as a rar archive.

v0jiuqi avatar Dec 27 '23 02:12 v0jiuqi

Also, first use the ModuleBasic.out tool to see which operator takes the most time.

v0jiuqi avatar Dec 27 '23 02:12 v0jiuqi

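Network permission restrictions prevent me from sending the model. The test uses the simplest possible GRU model built in PyTorch, with only three layers (nn.GRU + nn.Sigmoid + nn.Linear), exported to ONNX with torch.export and then converted to MNN.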

yklidelong avatar Dec 27 '23 08:12 yklidelong

I just tested it, and the Batch=512 latency is not 512 times the Batch=1 latency.

v0jiuqi avatar Dec 27 '23 11:12 v0jiuqi

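Did you export a batch-512 ONNX model, convert it to MNN, and call runSession directly? Or did you export a batch-1 ONNX model, convert it to MNN, resize the inputTensor, and then call runSession?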

yklidelong avatar Dec 28 '23 03:12 yklidelong

Testing single-thread and 32-thread performance with the MNNV2Basic.out tool:

./MNNV2Basic.out gru_b512.mnn 50 0 0 0 1
Use extra forward type: 0
Open Model gru_b512.mnn
Load Cache file error.
The device support i8sdot:0, support fp16:0, support i8mm: 0
test_main, 282, cost time: 0.810000 ms
Session Info: memory use 1.055801 MB, flops is 0.368706 M, backendType is 13
Input size:65536
Session Resize Done.
Session Start running...
Tensor shape: 2, 512, 64,
fileName.str().c_str()=s ./input 0.txt in _loadInputFromFile, 110
output: h
output: output
precision:2, memory: 0, Run 50 time:
Avg= 20.475580 ms, OpSum = 21.660240 ms min= 20.360001 ms, max= 21.699001 ms

./MNNV2Basic.out gru_b512.mnn 50 0 0 32
Use extra forward type: 0
Open Model gru_b512.mnn
Load Cache file error.
The device support i8sdot:0, support fp16:0, support i8mm: 0
test_main, 282, cost time: 1.670000 ms
Session Info: memory use 1.055801 MB, flops is 0.368706 M, backendType is 13
Session Resize Done.
Session Start running...
Tensor shape: 2, 512, 64,
Input size:65536
fileName.str().c_str()=s ./input 0.txt in _loadInputFromFile, 110
output: h
output: output
precision:2, memory: 0, Run 50 time:
Avg= 25.187241 ms, OpSum = 26.607084 ms min= 25.158001 ms, max= 25.363001 ms
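To compare the two runs directly, the summary lines can be parsed and the averages divided. This is a small illustrative sketch using only the Python standard library. The helper name `parse_summary` and the regular expression are my own and are not part of MNN; the two summary strings are copied from the logs above.

```python
import re

# Summary lines from the two MNNV2Basic.out runs above, keyed by thread count.
SUMMARIES = {
    1: "Avg= 20.475580 ms, OpSum = 21.660240 ms min= 20.360001 ms, max= 21.699001 ms",
    32: "Avg= 25.187241 ms, OpSum = 26.607084 ms min= 25.158001 ms, max= 25.363001 ms",
}

# Matches "Avg= 1.0", "OpSum = 2.0", "min= 0.5", "max= 3.0", ignoring case
# and tolerating missing or extra whitespace around "=".
_FIELD = re.compile(r"(avg|opsum|min|max)\s*=\s*([0-9]+(?:\.[0-9]+)?)", re.IGNORECASE)


def parse_summary(line):
    """Return the timing fields of one summary line as {name: milliseconds}."""
    return {name.lower(): float(value) for name, value in _FIELD.findall(line)}


stats = {threads: parse_summary(line) for threads, line in SUMMARIES.items()}
slowdown = stats[32]["avg"] / stats[1]["avg"]

if __name__ == "__main__":
    for threads, fields in sorted(stats.items()):
        print(f"{threads:>2} thread(s): avg {fields['avg']:.3f} ms")
    print(f"32-thread avg / 1-thread avg = {slowdown:.2f}")
```

On these numbers the 32-thread average is roughly 23% higher than the single-thread average, so the extra threads add overhead rather than speedup for this model.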

yklidelong avatar Dec 28 '23 06:12 yklidelong

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Feb 26 '24 09:02 github-actions[bot]