Is model parallelism supported for PyTorch?
If I write my own multi-GPU model or use torch.distributed.pipeline.sync.Pipe, would multi-node training still work with BytePS?
We are working on supporting model parallelism. For now, you can still use BytePS to optimize the allreduce primitive in your code.
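
For illustration, here is a minimal sketch of what that could look like. It assumes BytePS's `byteps.torch` module, whose `push_pull` call plays the role of allreduce; the tensor name and shapes are placeholders.

```python
import torch
import byteps.torch as bps

bps.init()                                # set up BytePS communication
torch.cuda.set_device(bps.local_rank())   # one GPU per worker process

# A stand-in for a gradient produced by your own multi-GPU model.
grad = torch.randn(10).cuda()

# push_pull is BytePS's allreduce equivalent: it aggregates the tensor
# across all workers (averaging here instead of summing).
avg_grad = bps.push_pull(grad, average=True, name="example.grad")
```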