mllm
When testing a QLoRA fine-tuned Gemma2 2B on Linux, it just generates a repeated character
I fine-tuned Gemma2 2B Instruct with QLoRA using BitsAndBytes (int4). The fine-tuned model works correctly when tested with transformers. I then followed the guide to build mllm and quantize the model for Linux. But when I test the fine-tuned model with the example demo_gemma, it always outputs a single repeated character (a Korean character). Has anyone else tried this? Or am I doing something wrong?
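For context, this is roughly the quantization setup I used on the transformers side (a minimal sketch; the exact values are assumptions and may differ from what your guide recommends):

```python
# Sketch of a typical QLoRA int4 setup with BitsAndBytes.
# Hypothetical parameter values -- adjust to match your own fine-tuning run.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # int4 base weights (QLoRA)
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants
)
# The config is then passed to AutoModelForCausalLM.from_pretrained(...,
# quantization_config=bnb_config) before attaching the LoRA adapters.
```

Note that with this setup the LoRA adapters must be merged back into the base weights (e.g. with peft's `merge_and_unload`) before converting the model for mllm; I suspect a mismatch at that step could cause degenerate output like this.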