
Vision model engine can't run on android / local

Open brianestadimas opened this issue 1 year ago • 22 comments

First of all, thank you to the developer team for the breakthrough work on LLMs.

I tried to run Phi-3 Vision from Hugging Face (https://huggingface.co/mllmTeam/phi-3-vision-instruct-mllm), but when I run it on the Android platform I get the following error:

Every logcat entry has the form:

    2024-12-05 13:15:10.057 15546-30489 MLLM org.saltedfish.chatbot E [/home/briancqi/Documents/MLLM/mllm/src/ParamLoader.cpp:136] <tensor name> not found

and the missing tensors, in order, are:

    language_model.model.embed_tokens.weight
    vision_embed_tokens.weight
    vision_embed_tokens.bias
    language_model.model.layers.0.input_layernorm.weight
    language_model.model.layers.0.input_layernorm.bias
    language_model.model.layers.0.self_attn.query_key_value.weight
    language_model.model.layers.0.self_attn.query_key_value.bias
    language_model.model.layers.0.self_attn.q_layernorm.weight
    language_model.model.layers.0.self_attn.q_layernorm.bias
    language_model.model.layers.0.self_attn.k_layernorm.weight
    language_model.model.layers.0.self_attn.k_layernorm.bias
    language_model.model.layers.0.self_attn.dense.weight
    language_model.model.layers.0.self_attn.dense.bias
    language_model.model.layers.0.post_attention_layernorm.weight
    language_model.model.layers.0.post_attention_layernorm.bias
    language_model.model.layers.0.mlp.dense_h_to_4h.weight
    language_model.model.layers.0.mlp.dense_h_to_4h.bias
    language_model.model.layers.0.mlp.dense_4h_to_h.weight
    language_model.model.layers.0.mlp.dense_4h_to_h.bias
    ... (the same set of "not found" weights and biases repeats for every layer, until layer 35)

I also tried other vision models and got the same error. Any solution would be helpful. Steps to reproduce:

  1. Clone https://github.com/lx200916/ChatBotApp/tree/e8dff3a90c2f969c927f60c6dd748ea37ab1460d
  2. Replace the Fuyu vision model with the Phi-3 Vision mllm model in storage/download/model (the vocab is already included)
  3. Run from Android Studio (the chat model works)

brianestadimas avatar Dec 05 '24 04:12 brianestadimas

ChatBotApp does not currently support Phi3v. For details on supported features and models, please refer to the following link https://github.com/lx200916/ChatBotApp/tree/e8dff3a90c2f969c927f60c6dd748ea37ab1460d?tab=readme-ov-file#supported-functions

You can run phi3v from the command line. However, there are still bugs in the phi3v implementation, and it requires a phone with 24GB of memory to run.

chenghuaWang avatar Dec 05 '24 04:12 chenghuaWang

Hello @chenghuaWang, thank you for your response. I was wondering whether I can contribute to this project and submit a PR later on, adding support for other, smaller vision models. However, I need some help locating the code to change, because the model architecture seems to be hard-coded somewhere in the org packages or mllm packages. Even a small demonstration of how you plug the quantized Fuyu model into the Android app would be really helpful for me.

Additionally, the mllm developers have already added support and C++ scripts for the Phi-3 Vision model and others. Thank you.

brianestadimas avatar Dec 06 '24 07:12 brianestadimas

I'm glad to hear that you're interested in contributing code to the mllm project. PRs are always welcome! What kind of vision models would you like to implement?

Just small demonstration on how you plug the Fuyu quantized model into the android app, would be really helpful for me.

For the Android side, cc @lx200916, who can give you some help.

chenghuaWang avatar Dec 06 '24 08:12 chenghuaWang

@chenghuaWang @lx200916 actually, after researching and trying several different vision-language models (such as MiniCPM, Fuyu, Phi Vision, Imp, and others), I found that most of them are not suitable for Android devices in terms of RAM consumption. Therefore I want to apply further quantization and pruning to these models and try to plug them into the Android code. But first I wanted to plug the Phi-3 Vision model from the mllm team's HF into the Android app; I got errors on some layers because the package seems to be hard-coded to support only the Fuyu model.

It would be helpful for me to know where the package is hard-coded to the Fuyu model, because the mllm team has just released the Phi-3 Vision C++ configurations.

brianestadimas avatar Dec 06 '24 09:12 brianestadimas

Since this issue involves Android-supported models, Lee @lx200916 will follow up and may be able to provide some help. The number of tokens for phi3v is quite large, so TTFT (time to first token) will be relatively slow; we are currently developing NPU capabilities and researching corresponding pruning algorithms.

chenghuaWang avatar Dec 06 '24 11:12 chenghuaWang

Hello, if you want to migrate a supported model to the Android Demo, the main steps include:

  1. Pay attention to the file names and related enumeration values in the initStatus function in app/src/main/java/org/saltedfish/chatbot/viewModel.kt (Since you are only replacing the Fuyu model and not adding a new one, you can skip this step).
  2. Focus on the https://github.com/UbiquitousLearning/mllm/blob/main/tools/jni/LibHelper.cpp file, where the actual model inference code for Android is contained. You need to modify the Fuyu-related code by following the example in https://github.com/UbiquitousLearning/mllm/blob/main/examples/demo_phi3v.cpp, including the C++ class name for model loading and inference-related code. You need to use callbacks to return generated tokens to the Android UI layer.
  3. Recompile the Android program. Refer to the README for specific steps.
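The callback flow in step 2 — streaming generated tokens back to the Android UI layer — can be sketched roughly like this. Everything below is a hypothetical illustration of the pattern, not the actual LibHelper.cpp API (names such as `TokenCallback` and `run_inference` are made up):

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of the callback pattern used to stream generated
// tokens from the C++ inference loop back to the Kotlin/JNI layer.
using TokenCallback = std::function<void(const std::string& token, bool is_final)>;

// Stand-in for the model's generate loop; the real code would sample from
// the Phi3V module instead of iterating over a fixed token list.
void run_inference(const std::vector<std::string>& fake_tokens,
                   const TokenCallback& on_token) {
    for (size_t i = 0; i < fake_tokens.size(); ++i) {
        bool is_final = (i + 1 == fake_tokens.size());
        // In the real app this call would cross JNI to update the UI.
        on_token(fake_tokens[i], is_final);
    }
}
```

The key design point is that the UI never blocks on the full generation: each token is pushed up as soon as it is produced, with a flag marking the last one.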

lx200916 avatar Dec 06 '24 15:12 lx200916

Hi Lee @lx200916, thank you for pointing out the changes. After some trial and error, I have successfully built the CMake project based on the changes to LibHelper.cpp.

However, when I run it on Android again, I still get the same error message.

As a note, I have also configured everything in the JNI bridge and in LibHelper.

It seems I must have missed some steps, so I kindly ask for your assistance. Deepest thanks.

brianestadimas avatar Dec 07 '24 11:12 brianestadimas

The code all seems fine. Maybe you need to clean the Gradle build cache? Remove the app/build folder and try building again.

lx200916 avatar Dec 07 '24 11:12 lx200916

I think there is a problem in the connection: the Kotlin code is still trying to load the image via processing_fuyu instead of phi, even though I have already changed the JNI bridge model type and initialization. I also tried clearing all caches and Gradle files as you suggested.

brianestadimas avatar Dec 07 '24 11:12 brianestadimas

Did you replace the libmllm.a in the Android project with the newly built one? The Kotlin code does not know anything about the model; it just passes the filename and model type (an int32 enum) to JNI.
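That narrow Kotlin-to-JNI contract can be sketched like this. The enum values and function name below are purely illustrative, not mllm's actual definitions:

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch: Kotlin hands JNI only a file path and an int32
// model-type code; everything model-specific is resolved in C++.
enum class ModelType : int32_t { LLM = 0, FUYU = 1, PHI3V = 2 };  // illustrative values

// On the C++ side, the int32 received from Kotlin selects which model
// class to construct and run.
std::string select_backend(int32_t model_type_from_kotlin) {
    switch (static_cast<ModelType>(model_type_from_kotlin)) {
        case ModelType::FUYU:  return "FuyuModel";
        case ModelType::PHI3V: return "Phi3VModel";
        default:               return "TextLLM";
    }
}
```

This is why rebuilding only the Kotlin side changes nothing: the dispatch above lives inside libmllm.a, so a stale static library keeps running the old model code no matter what the app passes down.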

lx200916 avatar Dec 07 '24 11:12 lx200916

Thank you @lx200916, I think I had missed that last step. After trying it, the code now runs into an error.

I think the problem with the code is that I have to define the processor as static, because it can't be assigned to the preprocessor (the class is not available) for phi3v. If you have some time, could you review the code and give some input? It would be very helpful for me.

https://github.com/brianestadimas/LibHelper/blob/main/LibHelper.cpp

brianestadimas avatar Dec 07 '24 13:12 brianestadimas

First, I do not recommend adding enumeration values, as this can change the memory layout and cause confusion when loading other models. I suggest you only modify the original Fuyu code: replace the Fuyu-related file and weight names with Phi3v, and replace the inference code as well.

Second, given that you have modified the number and names of the enumeration values, you should replace the LibHelper header file in the Android project to ensure a consistent memory layout.

After making these modifications, clear the cache and recompile to check if these errors are resolved.
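The header-mismatch hazard can be shown with a toy example: if the enum in the Android project's copy of the header drifts from the one compiled into libmllm.a, the same int32 means different models on each side. All names and values below are made up for illustration:

```cpp
#include <cstdint>

// Toy illustration only. Suppose the library was compiled with the
// original enum...
enum class LibModelType : int32_t { LLM = 0, FUYU = 1 };

// ...but the app's modified header inserted a new value, shifting the
// codes that come after it:
enum class AppModelType : int32_t { LLM = 0, PHI3V = 1, FUYU = 2 };

// The app thinks it is requesting Fuyu...
constexpr int32_t requested = static_cast<int32_t>(AppModelType::FUYU);  // 2

// ...but the library never defined a model with that code.
bool library_understands(int32_t code) {
    return code == static_cast<int32_t>(LibModelType::LLM)
        || code == static_cast<int32_t>(LibModelType::FUYU);
}
```

Keeping a single shared header (and rebuilding both sides from it) removes this entire class of bug.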

lx200916 avatar Dec 07 '24 15:12 lx200916

This is my full code: https://github.com/brianestadimas/LibHelper/blob/main/LibHelper.cpp

brianestadimas avatar Dec 09 '24 06:12 brianestadimas

Hi @lx200916, the model now loads successfully, but I have run into one more problem. When I debug, the LOGI("Step 2 Done"); line is never reached. Do you have any suggestions? It seems that auto result = (*module_)(input_tensor); is not working, or is taking too long.

brianestadimas avatar Dec 10 '24 09:12 brianestadimas

Hi Lee @lx200916 @chenghuaWang, I would like to follow up on the previous issue; any help would be greatly appreciated.

brianestadimas avatar Dec 11 '24 02:12 brianestadimas

Hi,

  1. For larger multimodal models, inference may indeed take significantly longer, especially for larger images. You can try waiting longer or checking CPU usage to determine if inference is in progress.
  2. Our preprocessing mechanism may not be fully compatible with images that have transparency (you can ignore this if your images are in JPEG format).

So, I suggest waiting longer and checking whether any error appears.

lx200916 avatar Dec 11 '24 03:12 lx200916

@chenghuaWang @lx200916 I have successfully integrated Phi-3 Vision and SmolLM into the app.

brianestadimas avatar Dec 12 '24 11:12 brianestadimas

Yes, the inference took a long time.

brianestadimas avatar Dec 12 '24 11:12 brianestadimas

@chenghuaWang @lx200916 I have successfully integrated Phi-3 Vision and SmolLM into the app.

Congratulations! Please submit a PR; cc @yirongjie, who will check and merge it.

chenghuaWang avatar Dec 12 '24 12:12 chenghuaWang

Thanks @chenghuaWang, I will clean up the code; it is pretty messy right now.

brianestadimas avatar Dec 12 '24 12:12 brianestadimas

Actually @chenghuaWang, there is still a memory problem with the vision models when running on Android. As @lx200916 mentioned before, Fuyu requires at least 24 GB of RAM to run, and Phi-3 Vision can run in approximately 10 GB, but it is very slow. Since I am now familiar with mllm models, I want to try to deploy the Qwen 2B vision model with quantization awareness, but I will need some support to do that, since I have to replicate the model's architecture.

brianestadimas avatar Dec 13 '24 07:12 brianestadimas

mllm has not yet implemented the vision models of the Qwen2-VL series. Qwen2-VL has many innovative features that require changes to mllm's operator library. We are actively adding support for different VLMs, and Qwen2 is also on our to-do list. If you want to support the Qwen2-VL model, you can refer to the implementations in src/models/fuyu and src/models/llava. In particular, for Qwen2-VL you will need to support the Multimodal RoPE operator.
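The core idea behind Qwen2-VL's multimodal RoPE is that the rotary dimensions are split into sections, and each section is rotated with a different position component (e.g. temporal, height, width) instead of a single 1-D token index. The sketch below is a hedged, simplified illustration of that idea for one attention head; the function name, argument layout, and section scheme are assumptions, not mllm's or Qwen2-VL's actual operator interface:

```cpp
#include <cmath>
#include <vector>

// Simplified sketch of multimodal RoPE: rotate consecutive (even, odd)
// pairs of one head's vector, where each "section" of pairs uses its own
// position component rather than one shared token index.
std::vector<float> mrope(const std::vector<float>& x,      // head vector, even length
                         const std::vector<int>& sections, // pairs per section, sums to x.size()/2
                         const std::vector<int>& pos,      // one position per section
                         float base = 10000.0f) {
    const int dim = static_cast<int>(x.size());
    std::vector<float> out(x.size());
    int pair = 0;  // index over the dim/2 rotary pairs
    for (size_t s = 0; s < sections.size(); ++s) {
        for (int k = 0; k < sections[s]; ++k, ++pair) {
            // Frequency depends on the pair index; angle depends on which
            // position component (temporal/height/width) this section uses.
            float theta = pos[s] * std::pow(base, -2.0f * pair / dim);
            float c = std::cos(theta), si = std::sin(theta);
            float a = x[2 * pair], b = x[2 * pair + 1];
            out[2 * pair]     = a * c - b * si;
            out[2 * pair + 1] = a * si + b * c;
        }
    }
    return out;
}
```

With all position components zero, every rotation angle is zero and the input passes through unchanged, which is a useful sanity check when wiring a new RoPE variant into an operator library.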

chenghuaWang avatar Dec 14 '24 08:12 chenghuaWang