LLM Inference not working for Phi-2, Falcon and StableLM
Description
I'm having trouble getting the llm_inference sample for Android to work with the models for Phi-2, Falcon, and StableLM. When I use the model files generated with the llm_conversion Colab (specifically falcon_gpu.bin, phi2_gpu.bin, and stablelm_gpu.bin) and change the MODEL_PATH variable in InferenceModel to point to one of the converted models, the app gets stuck in an infinite loop after the user's question is submitted.
The sample works fine with any of the Gemma 2B or Gemma 2 2B models provided on Kaggle.
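For reference, the only change I make is roughly the following (a minimal sketch of the sample's InferenceModel; the exact file paths, maxTokens value, and class layout are assumptions based on the public llm_inference Android sample, not verbatim from my code):

```kotlin
// InferenceModel.kt (excerpt) — sketch of the MODEL_PATH swap described above.
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

class InferenceModel(context: Context) {
    companion object {
        // Works: a Gemma 2B .bin pushed to the device, e.g.
        // "/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin"
        // Hangs after the first prompt (this issue), e.g.:
        private const val MODEL_PATH = "/data/local/tmp/llm/phi2_gpu.bin"
    }

    private val llmInference: LlmInference

    init {
        // Standard LLM Inference task setup; only the model path differs
        // between the working (Gemma) and failing (Phi-2/Falcon/StableLM) runs.
        val options = LlmInference.LlmInferenceOptions.builder()
            .setModelPath(MODEL_PATH)
            .setMaxTokens(1024)
            .build()
        llmInference = LlmInference.createFromOptions(context, options)
    }

    // Synchronous generation; with the converted models this call never returns.
    fun generateResponse(prompt: String): String = llmInference.generateResponse(prompt)
}
```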
Environment
OS: Windows 10
MediaPipe version: 0.10.16
Android device: Samsung SM-A556B, Android 14.0 | arm64
IDE: Android Studio Koala