Hao Chen
Results: 1 issue of Hao Chen
The work at https://github.com/ContextualAI/HALOs reports that KTO outperforms DPO and PPO, and does not require a paired dataset
enhancement
pending