KTO (unpaired) Support
⚠️ Please check that this feature request hasn't been suggested before.
- [X] I searched previous Ideas in Discussions and didn't find any similar feature requests.
- [X] I searched previous Issues and didn't find any similar feature requests.
🔖 Feature description
The original HALOs repo supports the full version of KTO that allows for imbalanced data and more stable training.
See trainer here: https://github.com/ContextualAI/HALOs/blob/6333a8f03c5c12c0a0b791e083904eda47a5b96c/trainers.py#L758
This library currently only supports KTO training on pairs using the less stable version of the loss, called SimpleKTO in the original repo.
✔️ Solution
Port the KTOTrainer from the original repo, with the full loss as described in the attached image. Note that there is no backpropagation through the KL term; it only controls how saturated the loss is.
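As a rough, forward-only sketch of that full loss (my reading of the KTO paper and the linked trainer, not a copy of either): the function name `kto_loss`, its argument names, and the default values are hypothetical, and plain Python floats stand in for tensors. The KL estimate is the batch mean of log-ratios on mismatched prompt/completion pairs, clamped at zero; in an autograd framework it would be detached so that no gradient flows through it.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def kto_loss(policy_logps, ref_logps, labels,
             kl_policy_logps, kl_ref_logps,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Forward pass of the unpaired KTO loss (illustrative sketch only).

    policy_logps / ref_logps: summed log-probs of each completion under the
        policy and the reference model.
    labels: True for desirable completions, False for undesirable ones.
    kl_policy_logps / kl_ref_logps: log-probs of mismatched (x, y') pairs,
        used only to estimate the KL reference point.
    """
    # KL reference point z0: batch mean of log-ratios, clamped at 0.
    # Assumption: in a real trainer this value is detached from the graph.
    kl_terms = [p - r for p, r in zip(kl_policy_logps, kl_ref_logps)]
    z0 = max(0.0, sum(kl_terms) / len(kl_terms))

    losses = []
    for p, r, desirable in zip(policy_logps, ref_logps, labels):
        reward = p - r  # implied reward: log pi_theta(y|x) - log pi_ref(y|x)
        if desirable:
            losses.append(lambda_d * (1.0 - sigmoid(beta * (reward - z0))))
        else:
            losses.append(lambda_u * (1.0 - sigmoid(beta * (z0 - reward))))
    return sum(losses) / len(losses)
```

Because each example carries its own label and weight (`lambda_d` or `lambda_u`), the batch does not need one desirable and one undesirable completion per prompt, which is what makes imbalanced data workable.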
❓ Alternatives
No response
📝 Additional Context
No response
Acknowledgements
- [X] My issue title is concise, descriptive, and in title casing.
- [X] I have searched the existing issues to make sure this feature has not been requested yet.
- [X] I have provided enough information for the maintainers to understand and evaluate this request.
@winglian given that follow-up work has found that unpaired KTO outperforms DPO/IPO/CPO on various tasks (https://www.semanticscholar.org/paper/Insights-into-Alignment%3A-Evaluating-DPO-and-its-Saeidi-Verma/db407c3a60c6dc768fde8dd1088dab3be951f04e), would it be possible to add support for it in axolotl?
The TRL implementation of KTO is now stable, in case a reference other than the original repo (https://github.com/ContextualAI/HALOs) is needed.
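To make "unpaired" concrete: as far as I understand the TRL docs, its KTO trainer takes one row per completion, with `prompt`, `completion`, and a boolean `label`. The sketch below converts existing preference pairs into that shape. The helper name `pairs_to_unpaired` and the input keys `prompt` / `chosen` / `rejected` are assumptions for illustration, not axolotl or TRL API.

```python
def pairs_to_unpaired(pairs):
    """Split preference pairs into independent labeled rows (illustrative).

    Assumes each input dict has "prompt", "chosen", and "rejected" keys.
    """
    rows = []
    for ex in pairs:
        # The chosen completion becomes a desirable example...
        rows.append({"prompt": ex["prompt"],
                     "completion": ex["chosen"],
                     "label": True})
        # ...and the rejected completion an undesirable one.
        rows.append({"prompt": ex["prompt"],
                     "completion": ex["rejected"],
                     "label": False})
    return rows


def label_counts(rows):
    """Count desirable vs. undesirable rows, e.g. to check class balance."""
    desirable = sum(1 for row in rows if row["label"])
    return desirable, len(rows) - desirable
```

Rows that never had a counterpart (thumbs-up-only or thumbs-down-only feedback) can be appended directly in the same format, so the two counts do not have to match.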