MOSS-RLHF The release of reward model training code?

Open hejujie opened this issue 1 year ago • 3 comments

It's an excellent paper and makes a significant contribution to the entire alignment community. I wanted to ask whether you have any plans to open-source the training code for the reward model.

Jul 12 '23 13:07 hejujie

Thank you for your great support! Because reward model training involves additional methods, it will be explained in the second part of the technical report. Thank you for your support and recognition!

Jul 13 '23 02:07 Ablustrund

Thank you for your response. May I know if there is a specific timeline for the release of the second part of the technical report?

Jul 13 '23 05:07 hejujie

Probably in August or September of this year. Thanks for your interest.

Jul 13 '23 14:07 Ablustrund
