Fix RLHF loss metrics & single-gpu training script

Open li-plus opened this issue 1 year ago • 2 comments

This PR fixes:

  1. the actor/critic mean loss calculation
  2. the step-3 training script for the 1.3B model on a single GPU
  3. some typos

li-plus, Apr 22 '23 17:04

@microsoft-github-policy-service agree

li-plus, Apr 22 '23 17:04