llama-recipes

DPO Fine-tuning

Open: jens5588 opened this issue 9 months ago • 1 comment

🚀 The feature, motivation and pitch

Is it possible to adapt the fine-tuning script for DPO (Direct Preference Optimization) fine-tuning? The current version seems to work only for next-token-prediction fine-tuning.
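To make the difference from next-token-prediction fine-tuning concrete, here is a minimal sketch of the per-example DPO loss. It is not code from llama-recipes. The function names `dpo_loss` and `softplus` and the default `beta=0.1` are illustrative assumptions, and the inputs are assumed to be summed log-probabilities of each full response under the trainable policy and under a frozen reference model.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss (hypothetical helper, not a llama-recipes API).

    Each argument is the summed log-probability of a whole response
    (chosen or rejected) given the prompt, under the policy or the
    frozen reference model. `beta` scales the implicit reward.
    """
    # How much more the policy favors each response than the reference does.
    chosen_log_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_log_ratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin between the chosen and rejected log-ratios.
    margin = beta * (chosen_log_ratio - rejected_log_ratio)

    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)


if __name__ == "__main__":
    # The policy favors the chosen response more than the reference does,
    # so the loss falls below the log(2) baseline.
    print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

Unlike the next-token cross-entropy loss, this objective needs a pair of responses per prompt (chosen and rejected) plus a second forward pass through a reference model, which is why the existing script would need changes to its data pipeline and training loop rather than only a new loss function.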

Alternatives

No response

Additional context

No response

jens5588, May 07 '24 15:05

Thanks for the feedback! We are working on some examples and will let you know once they are integrated!

init27, Aug 19 '24 18:08