alignment-handbook
Add `scripts/run_kto.py`
Description
As briefly discussed with @lewtun this morning, this PR adds the `scripts/run_kto.py` script to fine-tune LLMs using the `trl.KTOTrainer` from the alignment-handbook.
The script should work as is, but it still needs to be tested. cc @jiwooya1000 and @nlee-208 in case you're interested.
The main reference used to put the script together was https://github.com/huggingface/trl/blob/main/examples/scripts/kto.py.
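
For reviewers less familiar with KTO: as far as I understand, `trl.KTOTrainer` trains on unpaired examples with `prompt`, `completion`, and a boolean `label`, rather than on chosen/rejected pairs. Below is a minimal sketch of how a paired preference record could be split into that shape. `pair_to_kto` is a hypothetical helper for illustration only and is not taken from `scripts/run_kto.py`; the input field names (`prompt`, `chosen`, `rejected`) are also assumptions.

```python
from typing import Dict, List, Union

KTOExample = Dict[str, Union[str, bool]]


def pair_to_kto(example: Dict[str, str]) -> List[KTOExample]:
    """Split one paired preference record into two unpaired KTO-style records.

    Assumes the input has "prompt", "chosen" and "rejected" string fields.
    """
    return [
        # The preferred completion becomes a desirable example.
        {"prompt": example["prompt"], "completion": example["chosen"], "label": True},
        # The rejected completion becomes an undesirable example.
        {"prompt": example["prompt"], "completion": example["rejected"], "label": False},
    ]


if __name__ == "__main__":
    pair = {"prompt": "2 + 2 = ?", "chosen": "4", "rejected": "5"}
    for row in pair_to_kto(pair):
        print(row)
```

Whether the script needs a step like this depends on the dataset passed to it; datasets already in the unpaired format would not.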