Platform (if cross-compiling, also include the target platform):

iOS

GitHub version:

release 2.8.1

Build method:

Ran the iOS demo from Xcode. MNN.framework is the one from the 2.8.1 release.

Output of the test script:

build % python ../tools/script/testMNNFromOnnx.py /Users/yjt/Downloads/app/py模型/新模型/lightglue_new.onnx 
Dir exist
onnx/test.onnx
tensor(float)
tensor(float)
tensor(float)
tensor(float)
['matches0', 'mscores0']
inputs:
kpts0
onnx/
kpts1
onnx/
desc0
onnx/
desc1
onnx/
outputs:
onnx/matches0.txt (1, 2)
onnx/
onnx/mscores0.txt (1,)
onnx/
hw.cpufamily: 458787763 , size = 4
The device support i8sdot:1, support fp16:1, support i8mm: 0
Start to Convert Other Model Format To MNN Model..., target version: 2.8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 8
[16:05:53] /Users/yjt/Downloads/MNN-2.8.1/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 17
Start to Optimize the MNN Net...
inputTensors : [ desc0, kpts0, desc1, kpts1, ]
outputTensors: [ matches0, mscores0, ]
Converted Success!
Check convert result by onnx, thredhold is 0.01
kpts0
kpts1
desc0
desc1
output: matches0
output: mscores0
matches0: (1, 2, )
mscores0: (1, )
TEST_SUCCESS

The relevant code is below. The inputs are filled from the outputs of another inference.

void testMNN() {
    // Inputs are the outputs of an earlier inference, in the same order as the
    // input names passed to Module::load below.
    std::vector<MNN::Express::VARP> _mnnInputs;
    _mnnInputs.emplace_back(mnnOutput0[2]); // desc0
    _mnnInputs.emplace_back(mnn0);          // kpts0
    _mnnInputs.emplace_back(mnnOutput1[2]); // desc1
    _mnnInputs.emplace_back(mnn1);          // kpts1

    // Optional RuntimeManager setup, currently disabled.
    /*
    MNN::ScheduleConfig sConfig;
    sConfig.type = MNN_FORWARD_OPENCL;
    sConfig.numThread = 4;
    std::shared_ptr<MNN::Express::Executor::RuntimeManager> rtmgr(MNN::Express::Executor::RuntimeManager::createRuntimeManager(sConfig), MNN::Express::Executor::RuntimeManager::destroy);
    rtmgr->setCache(".cachefile");
    */
    MNN::Express::Module::Config mdconfig; // default module config
    mdconfig.shapeMutable = false;
    // nullptr: no RuntimeManager is passed, so the default runtime is used.
    std::unique_ptr<MNN::Express::Module> mnnModule(MNN::Express::Module::load({ "desc0", "kpts0", "desc1", "kpts1"}, {"matches0", "mscores0"}, model_file.c_str(), nullptr, &mdconfig));
    auto outputs = mnnModule->onForward(_mnnInputs);
}

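If I uncomment the block above, that is, set up a RuntimeManager with type set to MNN_FORWARD_OPENCL, inference is slightly faster than PyTorch. That still falls short of what I expected: with a different model for image inference, MNN previously took about 1/9 of the time PyTorch did.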

Apr 18 '24 02:04 jtyan123

The model has been sent to the 120543985 mailbox.

Apr 18 '24 02:04 jtyan123

How did you enable OpenCL on iOS? Is this the simulator?

Apr 18 '24 05:04 jxt1234

On iOS, the GPU backend is normally MNN_FORWARD_METAL.

Apr 18 '24 05:04 jxt1234

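I don't know whether OpenCL was actually enabled; all I did was set the type field to MNN_FORWARD_OPENCL. After setting type to MNN_FORWARD_METAL, MNN inference took even longer, about 1 s more than PyTorch.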

Apr 18 '24 05:04 jtyan123

It ran on a real device, an iPhone 15 Plus, not the simulator.

Apr 18 '24 05:04 jtyan123

How did you build MNN?

Apr 19 '24 03:04 jxt1234

I see the same behavior with the iOS framework downloaded directly from your 2.8.1 release.

Apr 22 '24 02:04 jtyan123

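How are you measuring? Timing should normally start from the second forward, and the model should be run several times in a row. See the speed tests in project/ios/Playground and tools/cpp/ModuleBasic.cpp.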
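
As a rough illustration of that measurement pattern (not code from the MNN repository), the sketch below runs a callable a few times untimed and then averages several timed runs. The helper name averageMs and its parameters are hypothetical; it uses only the C++ standard library.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper, not part of MNN: runs `fn` `warmup` times without
// timing, then returns the mean wall-clock time of `runs` calls in milliseconds.
double averageMs(const std::function<void()>& fn, int warmup, int runs) {
    // Warm-up calls: the first forward may include one-time setup cost,
    // so it is excluded from the measurement.
    for (int i = 0; i < warmup; ++i) {
        fn();
    }
    if (runs <= 0) {
        return 0.0;  // nothing to time
    }
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) {
        fn();
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> elapsed = end - start;
    return elapsed.count() / runs;
}
```

With the code from this issue, the callable would presumably wrap the onForward call, for example averageMs([&] { auto outputs = mnnModule->onForward(_mnnInputs); }, 1, 10). With a GPU backend the outputs may also need to be read back inside the timed call so that the work has actually finished; that is an assumption to check against ModuleBasic.cpp.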

Apr 23 '24 06:04 jxt1234

Marking as stale. No activity in 60 days.

Jun 22 '24 09:06 github-actions[bot]

MNN inference takes longer than PyTorch inference