mllm
Fast Multimodal LLM on Mobile Devices
Hello developers, I noticed that you already support the QNN backend; excellent work! Do you have any plans to support the MTK APU via its NeuroPilot SDK?
```
QNN alloc size: 4194304
QNN alloc size: 262144
model.layers.0.self_attn.ires_split-00_view_is QNN INT8 op 0.0ms
[ ERROR ] Tensor name InceptionV3_InceptionV3_Conv2d_1a_3x3_Conv2D_stride already exists in the graph.
[ ERROR ] QnnModel::addTensor() Creating tensor...
```
I fine-tuned Gemma 2 2B Instruct with BitsAndBytes (int4). It works when tested with transformers. Then I followed the guide to build mllm and quantize the model for Linux....
[Issue while building the APK](https://github.com/lx200916/ChatBotApp/issues/3#issue-2489257920): I am getting the same kind of error with a non-QNN build as well.
Hello, thank you for sharing your valuable code with the community. After compiling and copying all the files, I tried to run the Qwen NPU model on a Galaxy S24...
Hello, I've been asking a lot of questions today. After building the example Android app and installing it on a Galaxy S22 model with 12GB...
Can these examples be built as .so files that can be called from Python code on Android?
I am trying to use Qwen-2.0 in mllm. I converted the model and vocab using the given tools. However, the output of Qwen-2.0 was garbled. Do I need to do any...
How did you obtain the two model files, qwen-1.5-1.8b-chat-int8.mllm and qwen-1.5-1.8b-chat-q4k.mllm?
Hello, regarding the following function in LLaMAAdd.cpp:

```cpp
int32_t hvx_add_af(
    float *restrict input,
    float *restrict input2,
    float *restrict output,
    uint32_t size)
{
    ...
    sline1 = Q6_V_valign_VVR(sline1c, sline1p, (size_t)input);
    sline2 = Q6_V_valign_VVR(sline2c, sline2p, (size_t)input2);
    ...
}
```
...