
DPO alignment doesn't work on LoRA models as suggested

Open Abe13 opened this issue 1 year ago • 1 comment

You claim that "In practice, we find comparable performance for both full and LoRA fine-tuning, with the latter having the advantage of producing small adapter weights that are fast to upload and download from the Hugging Face Hub."

However, when I try the DPO-aligned LoRA model that you have trained, alignment-handbook/zephyr-7b-dpo-lora, I observe severe performance degradation. Here is an example of model output that seems confused: [image]

Even the training loss indicates that the model has not learned much: [image]

For comparison, here is the training loss for the full-model DPO alignment: [image]
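As a reference point for reading these curves (my own sketch, not something from the handbook): with the standard DPO objective, the loss starts at ln(2) ≈ 0.693 while the policy still matches the reference model, so a curve that stays flat near 0.693 suggests the adapter has learned little, whereas a healthy run drifts clearly below it. A minimal computation, assuming the usual sigmoid form of the DPO loss:

```python
import math

def dpo_loss(beta: float, logratio_chosen: float, logratio_rejected: float) -> float:
    """Standard DPO loss: -log sigmoid(beta * (chosen - rejected)),
    where each argument is log pi_theta(y|x) - log pi_ref(y|x)."""
    margin = beta * (logratio_chosen - logratio_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# At initialization the policy equals the reference, so both log-ratios are 0
# and the loss is exactly ln(2) ~= 0.693.
initial = dpo_loss(beta=0.1, logratio_chosen=0.0, logratio_rejected=0.0)

# Once the policy prefers chosen over rejected responses, the margin is
# positive and the loss drops below ln(2).
trained = dpo_loss(beta=0.1, logratio_chosen=5.0, logratio_rejected=-5.0)
```

So the flat-near-0.693 LoRA curve and the descending full-model curve would be consistent with the adapter barely updating.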

Could you please clarify? Does my observation differ from what you have experienced?

Thanks

Abe13 — Dec 06 '23 19:12