following-instructions-human-feedback
following-instructions-human-feedback copied to clipboard

Published 20 hours ago •

Reame
Issues

Where to find the experiment comparation: Using the data of training reward model for fine-tuning without reinforcement learning.

Open guotong1988 opened this issue 1 year ago • 0 comments

Thank you very much!

Apr 18 '23 03:04 guotong1988