Modify unit test to cover more cases
This PR adds two sets of modifications to the unit tests for better coverage of optimizer functionality:
- Increase the number of parameter groups to better catch errors in the optimizer step function.
- Reduce the loss scale via `initial_scale_power` to lower the likelihood of overflow in the backward pass, so that the optimizer step is eventually called.
cc: @tjruwase, @jeffra
Can one of the admins verify this patch?