RL4LMs
Do you have any plans to apply the recently published Reinforced Self-Training (ReST)?
Reinforced Self-Training (ReST) for Language Modeling: https://arxiv.org/abs/2308.08998