KTO (unpaired) Support

Open • kawine opened this issue 1 year ago • 1 comment

⚠️ Please check that this feature request hasn't been suggested before.

[X] I searched previous Ideas in Discussions and didn't find any similar feature requests.
[X] I searched previous Issues and didn't find any similar feature requests.

🔖 Feature description

The original HALOs repo supports the full version of KTO that allows for imbalanced data and more stable training.

See trainer here: https://github.com/ContextualAI/HALOs/blob/6333a8f03c5c12c0a0b791e083904eda47a5b96c/trainers.py#L758

This library currently only supports KTO training on pairs using the less stable version of the loss, called SimpleKTO in the original repo.
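To make the paired/unpaired distinction concrete, here is a rough sketch of what unpaired, imbalanced data could look like. The field names (`prompt`, `completion`, `label`) and the `unpair` helper are illustrative assumptions, not this library's schema.

```python
from collections import Counter

def unpair(pairs):
    """Split paired preference records into independent labelled examples.

    Hypothetical helper: each (prompt, chosen, rejected) record becomes
    one desirable and one undesirable example.
    """
    examples = []
    for p in pairs:
        examples.append({"prompt": p["prompt"], "completion": p["chosen"], "label": True})
        examples.append({"prompt": p["prompt"], "completion": p["rejected"], "label": False})
    return examples

# Paired data, the only form currently supported.
paired = [{"prompt": "2+2?", "chosen": "4", "rejected": "5"}]

# Feedback that has no counterpart, e.g. thumbs-up signals only.
extra = [
    {"prompt": "Capital of France?", "completion": "Paris", "label": True},
    {"prompt": "Largest planet?", "completion": "Jupiter", "label": True},
]

dataset = unpair(paired) + extra
counts = Counter(ex["label"] for ex in dataset)  # desirable vs. undesirable counts
```

The resulting dataset has more desirable than undesirable examples, which is the imbalanced case that a pairs-only trainer cannot consume directly.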

✔️ Solution

Port the KTOTrainer from the original repo with the full loss, as described in the attached image. Note that there is no backpropagation through the KL term; it is used only to control saturation of the loss.

(Image: screenshot of the full KTO loss, taken 2024-01-24; not reproduced here.)
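For reference, a minimal framework-free sketch of the full loss as I understand it from the KTO paper. The function name, argument names, and default weights are assumptions for illustration, and clamping the KL estimate at zero follows what reference implementations appear to do; the linked trainer remains the authoritative version.

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def kto_loss(log_ratios, desirable, kl_estimate,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Mean unpaired KTO loss over a batch (illustrative sketch).

    log_ratios[i]: log pi_theta(y_i | x_i) - log pi_ref(y_i | x_i)
    desirable[i]:  True if y_i is labelled desirable, else False
    kl_estimate:   batch-level estimate of KL(pi_theta || pi_ref); in an
                   autograd framework this would be detached so that no
                   gradient flows through it
    """
    z0 = max(kl_estimate, 0.0)  # assumption: clamp the estimate at zero
    losses = []
    for r, good in zip(log_ratios, desirable):
        if good:
            # Desirable: value rises as the reward exceeds the reference point.
            losses.append(lambda_d - lambda_d * sigmoid(beta * (r - z0)))
        else:
            # Undesirable: value rises as the reward falls below it.
            losses.append(lambda_u - lambda_u * sigmoid(beta * (z0 - r)))
    return sum(losses) / len(losses)

# Works on an imbalanced batch: two desirable examples, one undesirable.
batch_loss = kto_loss([1.0, 2.0, -1.0], [True, True, False], kl_estimate=0.5)
```

Because each example is scored on its own against the shared reference point `z0`, nothing requires a chosen/rejected pair per prompt. The weights `lambda_d` and `lambda_u` are where a class imbalance would be compensated.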

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

[X] My issue title is concise, descriptive, and in title casing.
[X] I have searched the existing issues to make sure this feature has not been requested yet.
[X] I have provided enough information for the maintainers to understand and evaluate this request.

Jan 25 '24 04:01 kawine

@winglian given that follow-up studies have found that unpaired KTO outperforms DPO/IPO/CPO on various tasks (https://www.semanticscholar.org/paper/Insights-into-Alignment%3A-Evaluating-DPO-and-its-Saeidi-Verma/db407c3a60c6dc768fde8dd1088dab3be951f04e), would it be possible to add support for it in axolotl?

The TRL implementation of KTO is now stable, in case a reference other than the original repo (https://github.com/ContextualAI/HALOs) is needed.

May 15 '24 00:05 kawine