ps_pytorch
Implement distributed machine learning with PyTorch + OpenMPI
Then it's very time-consuming.
Hi wang, I want to set up ps-pytorch on my own cluster, but the guide only covers setting it up on AWS. Can you give me some guidance on this?...
Hi wang, I'm wondering why the gradient tensor is converted to `float64`. I expected the gradients to stay `float32`, which should already be more precise than SGD requires. https://github.com/hwang595/ps_pytorch/blob/89a1cfa136b957073576fae827d39ef0fb09d2fc/src/distributed_worker.py#L258
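
To make the concern concrete, here is a minimal sketch of what the cast costs. It assumes the gradient is turned into a NumPy array before being sent, which may not match what the linked line actually does. The helper `payload_bytes` and the 1,000,000-element gradient are hypothetical and only there for illustration.

```python
import numpy as np


def payload_bytes(grad: np.ndarray, dtype) -> int:
    """Return the number of bytes needed to send `grad` after casting to `dtype`."""
    return grad.astype(dtype).nbytes


# Hypothetical gradient for a layer with 1,000,000 parameters (assumption).
rng = np.random.default_rng(0)
grad32 = rng.standard_normal(1_000_000).astype(np.float32)

bytes32 = payload_bytes(grad32, np.float32)
bytes64 = payload_bytes(grad32, np.float64)

# Casting float32 -> float64 -> float32 returns the original values exactly,
# so widening adds no information to a gradient that was computed in float32.
roundtrip = grad32.astype(np.float64).astype(np.float32)
lossless = bool(np.array_equal(roundtrip, grad32))

print(f"float32 payload: {bytes32} bytes")
print(f"float64 payload: {bytes64} bytes")
print(f"round trip is lossless: {lossless}")
```

Under these assumptions the `float64` payload is twice the size of the `float32` one, so every gradient exchange would move twice as many bytes without gaining precision.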