modify the quantize.py file for efficiency

Open yaozhewei opened this issue 4 years ago • 1 comments

  1. Replace the torch.split-based min/max calculation with torch.amin/torch.amax for faster computation.
  2. Update the stochastic rounding (SR) logic so that it is faster and cleaner:
     a. support both symmetric and asymmetric SR at the PyTorch level
     b. reduce the number of newly created tensors from 2 to 1
     c. support CPU tensors as well
  3. Change fp16 to fp32 to avoid overflow.
  4. Simplify some other logic so that it is easier to understand.
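
As a rough illustration of the first three points, here is a minimal sketch of group-wise fake quantization. It is not the actual quantize.py code from this PR: the function names (`group_min_max`, `fake_quantize`), their parameters, and the clamping epsilon are all hypothetical, and the stochastic rounding shown is simply the standard `floor(v + u)` formulation with `u` drawn uniformly from [0, 1).

```python
import torch


def group_min_max(x: torch.Tensor, num_groups: int):
    """Per-group min and max using amin/amax instead of a torch.split loop."""
    # Cast to fp32 first so the later range arithmetic cannot overflow fp16.
    grouped = x.float().reshape(num_groups, -1)
    return grouped.amin(dim=1, keepdim=True), grouped.amax(dim=1, keepdim=True)


def fake_quantize(x, num_bits=8, num_groups=1, symmetric=False, stochastic=False):
    """Hypothetical helper: quantize and dequantize x group by group."""
    orig_shape, orig_dtype = x.shape, x.dtype
    grouped = x.float().reshape(num_groups, -1)
    g_min, g_max = group_min_max(x, num_groups)

    if symmetric:
        q_lo, q_hi = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
        absmax = torch.maximum(g_min.abs(), g_max.abs())
        scale = absmax.clamp(min=1e-8) / q_hi
        offset = torch.zeros_like(scale)
    else:
        q_lo, q_hi = 0, 2 ** num_bits - 1
        scale = (g_max - g_min).clamp(min=1e-8) / q_hi
        offset = g_min

    scaled = (grouped - offset) / scale
    if stochastic:
        # Only one extra tensor is allocated: uniform noise in [0, 1).
        # floor(v + u) rounds v up with probability equal to its fractional part.
        q = torch.floor(scaled + torch.rand_like(scaled))
    else:
        q = torch.round(scaled)
    q = q.clamp(q_lo, q_hi)

    # Dequantize and restore the caller's shape and dtype.
    return (q * scale + offset).reshape(orig_shape).to(orig_dtype)
```

Because the sketch only uses ordinary tensor operations, it runs on CPU tensors as well as CUDA tensors, and the same code path handles the symmetric and asymmetric cases. Usage might look like `fake_quantize(weight, num_bits=8, num_groups=4, stochastic=True)`, where the number of elements in `weight` is assumed to be divisible by `num_groups`.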

yaozhewei, Nov 04 '21 06:11

Can one of the admins verify this patch?

rocm-mici, Jun 09 '22 20:06

Stale PR. quantize.py is quite different now. These changes are no longer relevant, therefore closing the PR.

mrwyattii, Aug 23 '23 21:08