OpenRLHF

Published 20 hours ago •


Reward model dataset question

Open burger-pb opened this issue 2 months ago • 3 comments

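I included a code dataset during supervised fine-tuning, which gave the model decent coding ability. When training the reward model in the RLHF stage, do I need to include code data again? If I leave it out, will the model's coding ability degrade?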

Apr 18 '24 05:04 burger-pb