In InstructGPT, during reward model (RM) training, all the <prompt, response> pairs that belong to the same prompt are put together to calculate the loss. Is this also implemented in DeepSpeed-Chat?
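
For context, here is a minimal sketch of the grouping the question refers to. It is not DeepSpeed-Chat code. It assumes the K responses to one prompt have already been scored by the reward model and are ordered from most to least preferred; the function name `grouped_pairwise_loss` and the plain-float inputs are hypothetical, chosen only for illustration. Every one of the K-choose-2 comparisons for the prompt contributes to a single averaged loss term, so the prompt is treated as one batch element.

```python
import math
from itertools import combinations
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def grouped_pairwise_loss(rewards_ranked: Sequence[float]) -> float:
    """Pairwise ranking loss over all comparisons for one prompt.

    rewards_ranked: scalar rewards for the K responses to a single prompt,
    ordered from most preferred to least preferred (assumed input format).
    """
    # combinations() keeps input order, so each tuple is (preferred, rejected).
    pairs = list(combinations(rewards_ranked, 2))
    if not pairs:
        raise ValueError("need at least two responses per prompt")
    # Average -log(sigmoid(r_preferred - r_rejected)) over the K-choose-2 pairs.
    return -sum(log_sigmoid(r_w - r_l) for r_w, r_l in pairs) / len(pairs)


if __name__ == "__main__":
    # Four responses to one prompt give six comparisons in one loss term.
    print(grouped_pairwise_loss([1.5, 0.7, 0.2, -0.4]))
```

In a real training loop the rewards would be tensors produced by one forward pass per response, so grouping by prompt also means each response is scored once rather than once per comparison.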

Open BaiStone2017 opened this issue 1 year ago • 0 comments

BaiStone2017, Apr 17 '23 01:04