Remove model_state.use_fp8_ddp and optimizer.all_reduce_grads

Open • wkcn opened this issue 1 year ago • 1 comment

Description

The argument model_state.use_fp8_ddp is deprecated, and every MS-AMP example already sets it to True. In addition, the function optimizer.all_reduce_grads is not used.

Major Revision

  • Remove model_state.use_fp8_ddp
  • Remove optimizer.all_reduce_grads
  • Remove the related unit tests
  • Update the unit test test_fp8linear_backward, since the weight gradient is a torch.Tensor when model_state.use_fp8_ddp is True.
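
Before deleting the two names, it can help to list every place that still references them. The following is a minimal sketch using only the Python standard library, not part of MS-AMP: the helper name find_deprecated_usages and the constant DEPRECATED_NAMES are hypothetical, and the scan is a plain text match on *.py files, so it will also report matches inside comments and strings.

```python
import re
from pathlib import Path

# The two names this issue proposes to remove (taken from the issue text).
DEPRECATED_NAMES = ("use_fp8_ddp", "all_reduce_grads")

# Match either name as a whole identifier, e.g. in `model_state.use_fp8_ddp`.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in DEPRECATED_NAMES) + r")\b"
)


def find_deprecated_usages(root):
    """Hypothetical helper: scan *.py files under `root` for the names above.

    Returns a list of (path, line_number, name) tuples, ordered by path
    and then by line number.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in _PATTERN.finditer(line):
                hits.append((str(path), lineno, match.group(1)))
    return hits
```

Calling find_deprecated_usages on a local checkout of the library, and again on a checkout of the examples repository, gives a list of call sites and tests to review by hand. An empty result after the change suggests no textual references remain.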

wkcn avatar Dec 14 '23 09:12 wkcn

MS-AMP-Examples still uses optimizer.all_reduce_grads, so we need to remove it from the examples as well.

tocean avatar Dec 18 '23 10:12 tocean