DeepSpeedExamples

[chat] generate process is not a single step in RL

Open ht-zhou opened this issue 1 year ago • 2 comments

Hi, I am from the ColossalAI team. I noticed that there are similarities between DeepSpeedChat and ColossalChat. We found that our code may contain an implementation error that could lead to convergence problems. If you referred to our implementation, you may encounter similar problems as well. We are planning to fix the bug within the next two weeks.

Meanwhile, we would appreciate it if you could reference our work if you adapted your code from our implementation.

ht-zhou, Apr 12 '23 02:04