mllm issues

CANNOT LINK EXECUTABLE "./demo_fuyu": library "libomp.so" not found: needed by main executable

3

Hi, I followed the steps to build and run on a Samsung S24 Android device and am facing the error below: mllm/scripts$ ./run_fuyu.sh ../vocab/fuyu_vocab.mllm: 1 file pushed, 0 skipped. 34.1 MB/s (5854575 bytes...

sharmilamani

How to check how many tokens/sec are generated in the Android app

Help me calculate tokens/s while running on Android

Vinaysukhesh98

Android app crashes while reading an image

12

The app reports "Fail To Load Models! Please Check if models exists at /sdcard/Download/model and restart", but I have copied the model to that location and still get the same error message. What would be the...

Vinaysukhesh98

bug

Why can't prefill and decode both run on the NPU?

4

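I looked at the code. My understanding is that prefill handles the preprocessing work and the main inference is done in the decode stage. Why does the code run prefill on the NPU while the important decode stage runs on the CPU?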

yhwang-hub

Prefill speed is approximately 4~6 tokens/s for Qwen1.5-1.8B

5

Hi, mllm-qnn works on my device, an OPPO Find X7 Ultra (Snapdragon 8 Gen 3, 16 GB RAM). However, the prefill speed for Qwen1.5-1.8B is approximately 4-6 tokens per second, which significantly diverges from the...

mengllm

Crash on Xiaomi 14 (8 Gen 3) with QNN

1

When we run main_qwen_npu on a Xiaomi 14, it produces the following crash log: ![WechatIMG16036](https://github.com/user-attachments/assets/ff87f385-867b-4be0-bbcd-b019f780eb2a)

zhuipiaochen

Segmentation fault on OPPO FindX7 Ultra (Snapdragon8Gen3)

2

DDR size = 16GB. Command: ./main_qwen_npu -s 64 -c 1 -l 512. Below is the tail of the logs: ` Memory Usage: 8910 MB(19036) at: execute graph: 94 chunk:1 execute qnn graph 95 model.layers.23.self_attn.or_split...

bingo787

Android crashed and forcibly rebooted when executing main_qwen_npu

9

Hello, I've executed `main_qwen_npu` following the [guideline](https://github.com/UbiquitousLearning/mllm/tree/main/src/backends/qnn). There were minor bugs, which I fixed manually (e.g., a missing `adb push ../vocab/qwen_merges.txt ...`). When I ran `main_qwen_npu`, Android crashed...

taegeonum

Is Subgraph Heterogeneous Compute Available in MLLM?

2

Hello author, I am doing research on heterogeneous computation for LLMs. While browsing the code, I noticed that MLLM's Net class has some content about subgraphs. My question is,...

MaTwickenham

feat: add_profilling_activation

add_profilling_activation

chunfenri
