DPO alignment doesn't work on LoRA models as suggested
However, when I try the DPO-aligned LoRA model that you have trained, alignment-handbook/zephyr-7b-dpo-lora, I experience a total performance degradation.
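For reference, this is roughly how I load and query the adapter, as a minimal sketch using transformers and peft. The base checkpoint, dtype, and generation settings here are my assumptions and may not match what the handbook recipe actually uses; substitute whatever base model the adapter card specifies.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base model; check the adapter's config for the actual base_model_name_or_path.
BASE = "mistralai/Mistral-7B-v0.1"
ADAPTER = "alignment-handbook/zephyr-7b-dpo-lora"

base = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Build a chat-formatted prompt and generate greedily.
messages = [{"role": "user", "content": "What is Direct Preference Optimization?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```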
Here is an example of model output that seems confused:
Even the training loss indicates that the model has not learned much:
For comparison, here is the training loss for the full-model DPO alignment:
Could you please clarify? Is my observation different from what you have experienced?
Thanks