Phi-3CookBook
LoRA fine-tuning Phi-3.5 MoE
Hi,
I recently fine-tuned the Phi-3.5-MoE-instruct and Phi-3.5-mini-instruct models using PEFT LoRA. The MoE model performs much worse than the 3.5 Mini. Are there any specific things to keep in mind when LoRA fine-tuning a mixture-of-experts model? Also, during MoE fine-tuning the validation loss shows as "No log".
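
For reference, here is a minimal sketch of roughly how I set up the run. The `target_modules` names, hyperparameters, and dataset variables are placeholders/assumptions rather than a verified recipe for this architecture, so please correct anything that looks off:

```python
# Minimal sketch of my LoRA setup (module names and datasets are
# assumptions, not a verified recipe for Phi-3.5-MoE).
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    TrainingArguments,
    Trainer,
)
from peft import LoraConfig, get_peft_model

model_id = "microsoft/Phi-3.5-MoE-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# LoRA on the attention projections only. The names below are my
# assumption -- check model.named_modules() for the real ones.
# Whether the expert FFN / router layers should also be targeted
# is part of my question.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# "No log" in the Trainer's progress table usually means no loss was
# recorded in that logging window (e.g. eval never ran, or
# logging_steps exceeds the steps per epoch). I set these and still
# see it for the MoE run:
training_args = TrainingArguments(
    output_dir="./phi35-moe-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
    eval_strategy="steps",  # "evaluation_strategy" on older transformers
    eval_steps=50,
    bf16=True,
)

# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=train_dataset,  # my instruction-tuning split
#     eval_dataset=eval_dataset,
#     tokenizer=tokenizer,
# )
# trainer.train()
```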