unsloth Model inference - performance drop when using unsloth

Open TomekPro opened this issue 7 months ago • 4 comments

Hi, I fine-tuned a model (yam-peleg/Experiment26-7B) using unsloth. During inference, the model's correctness drops when I load it with unsloth's FastLanguageModel. I see that some modules are replaced. It looks a bit odd that LlamaRotaryEmbedding is used for a Mistral-type model. Any idea whether this could cause the performance drop?
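One way to list exactly which modules were replaced is to compare the class name of every submodule across the two loaded models. The snippet below is only a minimal sketch: `module_types` and `diff_module_types` are hypothetical helper names, and it assumes both models are `torch.nn.Module` instances (so `named_modules()` is available). The diff itself works on plain dicts and does not depend on unsloth.

```python
from collections.abc import Mapping


def module_types(model) -> dict[str, str]:
    """Map each submodule name to its class name.

    Assumes `model` exposes torch.nn.Module.named_modules().
    """
    return {name: type(mod).__name__ for name, mod in model.named_modules()}


def diff_module_types(old: Mapping[str, str], new: Mapping[str, str]) -> dict[str, tuple]:
    """Return {name: (old_class, new_class)} for every entry that differs.

    A name present on only one side is reported with None for the other side.
    """
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}
```

With the two models loaded as `old_model` (plain transformers) and `unsloth_model` (FastLanguageModel), both placeholder names, calling `diff_module_types(module_types(old_model), module_types(unsloth_model))` would show entries such as a rotary embedding whose class differs between the two.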

OLD inference:

Unsloth way

When comparing model files I see the following differences: and this:

Jul 16 '24 12:07 TomekPro
