[Question] While waiting for the model's response on an Android phone, performing other operations may cause the phone to become unresponsive or reboot.

Open yangshgetui opened this issue 8 months ago • 2 comments

❓ General Questions

While the model is generating a response on an Android phone, doing anything else on the device can make the phone unresponsive or cause it to reboot. For example, this happens when I try to return to the home screen.

I suspect this is caused by insufficient GPU resources on the device. When I try to run on the CPU only instead, the app crashes with the following error:

    2025-03-04 15:03:37.647 19380-19447/ai.mlc.mlcchat E/AndroidRuntime: FATAL EXCEPTION: Thread-5
    Process: ai.mlc.mlcchat, PID: 19380
    org.apache.tvm.Base$TVMError: TVMError: Assert fail: T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32") == 4, Argument qwen2_q4f16_1_e396fd42f6a997ca798eafc3bf56647f_fused_dequantize_take1.p_model_embed_tokens_q_weight.device_type has an unsatisfied constraint: 4 == T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32")
    at org.apache.tvm.Base.checkCall(Base.java:173)
    at org.apache.tvm.Function.invoke(Function.java:130)
    at ai.mlc.mlcllm.JSONFFIEngine.runBackgroundLoop(JSONFFIEngine.java:65)
    at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:42)
    at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:40)
    at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:19)
    at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:18)
    at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
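
A note on reading the assertion: the `4` in the failed constraint is a DLPack device type code, and in `dlpack.h` the value 4 corresponds to `kDLOpenCL`. This suggests, though it does not prove, that the compiled model library expects its weights on an OpenCL device and received a tensor on a different device. The sketch below decodes the code from the message; the helper name `expected_device` and the regular expression are illustrative assumptions, not part of MLC LLM or TVM.

```python
import re

# Subset of DLPack's DLDeviceType enum values (from dlpack.h).
DL_DEVICE_TYPES = {
    1: "kDLCPU",
    2: "kDLCUDA",
    4: "kDLOpenCL",
    7: "kDLVulkan",
    8: "kDLMetal",
    10: "kDLROCM",
}


def expected_device(assert_message: str) -> str:
    """Return the device type name the compiled function expects.

    Hypothetical helper: it only parses the text of a TVM
    'device_type has an unsatisfied constraint' message.
    """
    match = re.search(
        r"device_type has an unsatisfied constraint: (\d+) ==",
        assert_message,
    )
    if match is None:
        raise ValueError("no device_type constraint found in message")
    code = int(match.group(1))
    return DL_DEVICE_TYPES.get(code, f"unknown({code})")


# Tail of the assertion message from the crash log above.
msg = (
    "Argument qwen2_q4f16_1_e396fd42f6a997ca798eafc3bf56647f_"
    "fused_dequantize_take1.p_model_embed_tokens_q_weight.device_type "
    "has an unsatisfied constraint: 4 == "
    'T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32")'
)
print(expected_device(msg))  # prints "kDLOpenCL"
```

If that reading is right, switching the runtime device to CPU without also using a model library compiled for a CPU target would be expected to fail this check.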

yangshgetui, Feb 13 '25 07:02