DeepSpeed
Clone tensors to avoid torch.save bloat
Fixes #3303
TODOs:
- Docs
- Unit tests (?)
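For context, here is a minimal sketch of the bloat the title refers to. It is not this PR's implementation. It assumes the bloat comes from `torch.save` serializing a tensor's entire underlying storage, so a small view into a large flat buffer is written out at the buffer's full size. The helper `serialized_size` and the names `flat_buffer` and `view` are hypothetical, chosen only for illustration.

```python
import io

import torch


def serialized_size(obj) -> int:
    """Return the number of bytes torch.save writes for obj (hypothetical helper)."""
    buffer = io.BytesIO()
    torch.save(obj, buffer)
    return buffer.getbuffer().nbytes


# A large flat buffer, standing in for a flattened group of parameters (about 4 MB).
flat_buffer = torch.zeros(1_000_000, dtype=torch.float32)

# A 10-element view into the buffer; it shares the buffer's storage.
view = flat_buffer[:10]

# Saving the view serializes the whole shared storage, not just 10 elements.
view_bytes = serialized_size({"weight": view})

# Cloning first gives the tensor its own 10-element storage.
clone_bytes = serialized_size({"weight": view.clone()})

print(f"view:  {view_bytes} bytes")
print(f"clone: {clone_bytes} bytes")
```

The cloned tensor holds the same values as the view, so the saved state is unchanged. The trade-off is a temporary extra copy of each tensor in memory while the checkpoint is being written.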
@stas00, please see docs https://deepspeed.readthedocs.io/en/rtd-staging/model-checkpointing.html#avoiding-zero-checkpoint-bloat
Looking at the rendering, the source formatting appears to be borked: it shows a literal :param: and the last section doesn't show up.
The doc is also hard to read because it refers to input. Let me try to make a better suggestion.
@stas00, thanks for the feedback. I have applied your suggestions. Please take another look.
Looking good now!