LLaMA-Factory
Will RLOO, a recent online RLHF algorithm, be integrated into the framework?
Reminder
- [X] I have read the README and searched the existing issues.
System Info
Paper: https://arxiv.org/abs/2402.14740
Hugging Face TRL implementation: https://github.com/huggingface/trl/blob/main/trl/trainer/rloo_trainer.py
Reproduction
None
Expected behavior
No response
Others
No response