MOSS-RLHF

The release of reward model training code?

Open hejujie opened this issue 1 year ago • 3 comments

It's an excellent paper and makes a significant contribution to the entire alignment community. Do you have any plans to open-source the training code for the reward model?

hejujie avatar Jul 12 '23 13:07 hejujie

Thank you for your support! Because reward model training involves additional methods, it will be covered in the second part of the technical report. We appreciate your recognition!

Ablustrund avatar Jul 13 '23 02:07 Ablustrund

Thank you for your response. May I know if there is a specific timeline for the release of the second part of the technical report?

hejujie avatar Jul 13 '23 05:07 hejujie

Probably in August or September of this year. Thanks for your interest.

Ablustrund avatar Jul 13 '23 14:07 Ablustrund