Open-Assistant
add training code for reward model
Trainer code for training a single-score reward model. It currently supports the WebGPT dataset and the raw human-feedback summarization datasets from OpenAI. See the README and rank_datasets.py for more details.
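For readers unfamiliar with single-score reward models, here is a rough sketch of the kind of pairwise ranking objective such a model is often trained with. This is an assumption for illustration only: the description above does not say which loss the trainer uses, and `pairwise_ranking_loss` is a hypothetical helper, not a function from this PR. It takes one scalar score per answer and penalizes pairs where the preferred answer does not score higher than the rejected one.

```python
import math


def pairwise_ranking_loss(chosen_scores, rejected_scores):
    """Mean of -log(sigmoid(chosen - rejected)) over all pairs.

    Hypothetical helper for illustration; not taken from this PR.
    Each list holds one scalar reward per answer, aligned by index.
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")

    total = 0.0
    for chosen, rejected in zip(chosen_scores, rejected_scores):
        margin = chosen - rejected
        # -log(sigmoid(margin)) == softplus(-margin), computed stably
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(chosen_scores)


# The loss is lower when the preferred answer gets the higher score.
good = pairwise_ranking_loss([2.0, 1.5], [0.0, -0.5])
bad = pairwise_ranking_loss([0.0, -0.5], [2.0, 1.5])
print(good < bad)
```

In an actual trainer the scores would come from the model's scalar output head and the loss would be computed on tensors, but the shape of the objective would be the same under this assumption.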
@yk Yeah, that was my mistake. I just reset the format setting.