DakeQQ
Hello, I'd like to recommend an Android deployment project based on ONNX Runtime. The Depth Anything-Small model (resized) achieves 11 FPS on two A76 cores and reaches 22 FPS on the 8Gen2.
### Feature request https://github.com/DakeQQ/Native-LLM-for-Android Hello, I'd like to recommend an Android LLM deployment project based on ONNX Runtime. It achieves **5.2 tokens/s** on a Huawei P40 and **8.5 tokens/s** on the 8Gen2 (_q8f32 & 786 sliding-window context_). Once ONNX Runtime ships **q4f16** support, the speed may improve by another 50%.
### Describe the issue I am trying to run the QNN HTP backend on my Android device, but I repeatedly encounter the following error: `[E:onnxruntime:, qnn_execution_provider.cc: 513 GetCapability] QNN SetupBackend...
Hi~ I've successfully deployed YOLOv9 on the Qualcomm 8Gen2 NPU, hitting 47 FPS with v9-C and surpassing 100 FPS with v9-T. Eager to share these outstanding achievements...
Hello, when tokenizer.cpp line 627 executes `HuggingfaceTokenizer->encode_.at(s)`, it throws "no key found" because every string stored in `encode_` carries a trailing "\r". Likewise, `decode_.at(id)` cannot find its key. As a temporary fix I added `line.pop_back();` just below line 498. Could you please take a look at what's going on? Thank you!
Hello, MNN team. We are trying to reproduce the MNN-LLM project, focusing on the source code in `llm.cpp`. We have exported an MNN model with `llmexport.py` and imitated the loading and inference flow from `llm.cpp`.

```cpp
// Loading
mMeta = std::make_shared<KVMeta>();
runtimeManager->setHint(MNN::Interpreter::HintMode::QKV_QUANT_OPTIONS, 0); // Turn it off to ensure no accuracy loss during the test.
runtimeManager->setHintPtr(MNN::Interpreter::HintMode::KVCACHE_INFO, mMeta.get());

// Inference
mMeta->add = ids_len;
std::vector...
```