Karan Dhiman

Results 2 issues of Karan Dhiman

AWS SageMaker now supports PyTorch training (single-node and distributed) using Lightning (https://pytorch-lightning.readthedocs.io/en/stable/). The blog post with the announcement will be added to this description once it has been published. In...

**Describe the bug** DeepSpeed provides a ZeRO configuration property `overlap_comm` which, according to the documentation, _Attempts to overlap the reduction of the gradients with backward computation_ (Ref: https://www.deepspeed.ai/docs/config-json/). I'm noticing...

bug
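
For context on the `overlap_comm` report above, here is a minimal sketch of where that property sits in a DeepSpeed JSON config. `overlap_comm` is nested under `zero_optimization`; the stage, batch size, and the helper name `with_overlap_comm` are illustrative assumptions, not values taken from the issue.

```python
import copy
import json

# Illustrative config only: the stage and batch size are assumed values,
# not the ones used in the original report.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
    },
}


def with_overlap_comm(config, enabled):
    """Return a deep copy of config with zero_optimization.overlap_comm set.

    Hypothetical helper for building two otherwise identical configs so that
    runs with and without overlap_comm can be compared.
    """
    updated = copy.deepcopy(config)
    updated.setdefault("zero_optimization", {})["overlap_comm"] = bool(enabled)
    return updated


if __name__ == "__main__":
    # Print both variants as JSON, the format DeepSpeed config files use.
    for flag in (True, False):
        print(json.dumps(with_overlap_comm(ds_config, flag), indent=2))
```

Producing both variants from one base config keeps every other setting fixed, which may help isolate whether `overlap_comm` alone accounts for a behavior change.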