DPO alignment doesn't work on LoRA models as suggested
However, when I try the DPO-aligned LoRA model that you have trained, alignment-handbook/zephyr-7b-dpo-lora, I experience a total performance degradation.
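For reference, this is roughly how I load and query the adapter, as a minimal sketch using transformers and peft. The base checkpoint, dtype, and generation settings here are my assumptions and may not match what the handbook recipe actually uses; substitute whatever base model the adapter card specifies.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base model; check the adapter's config for the actual base_model_name_or_path.
BASE = "mistralai/Mistral-7B-v0.1"
ADAPTER = "alignment-handbook/zephyr-7b-dpo-lora"

base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Build a chat-formatted prompt and generate greedily.
messages = [{"role": "user", "content": "What is Direct Preference Optimization?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```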
Here is an example of model output that seems confused:
Even the training loss indicates that the model has not learned much:
For comparison, here is the training loss for the full-model DPO alignment:
Could you please clarify? Is my observation different from what you have experienced?
Thanks