Samyam Rajbhandari

3 comments by Samyam Rajbhandari

@szhengac You are correct: LAMB and LARS implementations that are not aware of ZeRO will not work correctly with ZeRO. This is not a fundamental limitation of optimizer partitioning, though,...
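One plausible reason, offered here only as an illustration since the comment above is truncated: LAMB and LARS scale each layer's update by a ratio of L2 norms taken over the whole layer, while under optimizer partitioning each rank holds only a shard of that layer. The sketch below uses plain Python lists in place of tensors. The `partition` layout and the `global_norm_from_shards` helper are hypothetical stand-ins, not DeepSpeed APIs; the sum over shards plays the role of an all-reduce.

```python
import math

def l2_norm(values):
    """L2 norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))

def partition(values, world_size):
    """Split values into contiguous shards, one per rank (hypothetical layout)."""
    shard = math.ceil(len(values) / world_size)
    return [values[i * shard:(i + 1) * shard] for i in range(world_size)]

def global_norm_from_shards(shards):
    """Recover the full-layer norm from per-rank shards.

    The sum of squared local norms stands in for an all-reduce(sum)
    across ranks; this is an assumption for illustration only.
    """
    return math.sqrt(sum(l2_norm(s) ** 2 for s in shards))

weights = [3.0, 4.0, 12.0, 84.0]           # one layer's parameters
shards = partition(weights, world_size=2)  # what each rank would hold

local_norms = [l2_norm(s) for s in shards]  # what a ZeRO-unaware optimizer sees
full_norm = l2_norm(weights)                # what the layer-wise rule needs
combined = global_norm_from_shards(shards)  # shard-aware reconstruction

print(local_norms, full_norm, combined)
```

Each rank's local norm differs from the full-layer norm, so a trust ratio computed from it would be wrong, while combining the squared partial norms reproduces the full value.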

Hi Nathan, thank you for trying out DeepSpeed. I am a researcher on the DeepSpeed team. I wanted to share a few comments here that might be helpful: Small Models:...

@champson, @yefanhust are there specific models/scenarios you are looking to apply pipeline parallelism to? The scenarios in which PP is helpful for inference are very narrow, and applicable in just a...