
Pipeline Parallelism for PyTorch

123 PiPPy issues

https://github.com/pytorch/torchtitan/pull/161/files#diff-80b04fce2b861d9470c6160853441793678ca13904dae2a9b8b7145f29cd017aR269 IIRC @awgu mentioned there was an issue requiring this setting for the time being. Not sure why, or whether it has been fixed yet.

Currently have to work around this by using the regular `rmsnorm` for PP to be enabled:

```
torch._dynamo.exc.Unsupported: Illegal getattr invocation stride in strict mode
# coming from `if dy.stride(-1) != ...`
```
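
For reference, a minimal non-fused RMSNorm with no `stride()` introspection — the kind of "regular `rmsnorm`" the workaround swaps in. This is a sketch, not the torchtitan code; the shape handling and `eps` value are assumptions:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Plain rmsnorm: no data-dependent stride() checks in forward or
    backward, so strict-mode tracing has nothing illegal to trip over."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * rms * self.weight
```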

`torch.export` has a strict mode and a non-strict mode; for the difference, please read [Non-Strict Export](https://pytorch.org/docs/stable/export.html#non-strict-export). This PR switches to non-strict mode by default, improving the tracing success rate (no Dynamo graph breaks).
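
For illustration, a minimal sketch of selecting the mode via the `strict` flag on `torch.export.export`; the toy module and inputs are assumptions:

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        return x.relu() + 1

mod, args = Toy(), (torch.randn(4),)

# Strict mode traces with TorchDynamo, which can fail on unsupported
# Python constructs (graph breaks).
ep_strict = torch.export.export(mod, args, strict=True)

# Non-strict mode traces at the Python interpreter level instead, which
# is what this PR makes the default to improve tracing success.
ep_nonstrict = torch.export.export(mod, args, strict=False)
```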

https://github.com/pytorch/torchtitan/pull/161/files#diff-80b04fce2b861d9470c6160853441793678ca13904dae2a9b8b7145f29cd017aR254 In principle, the issue is that PP traced the non-FSDP model, and in that case the model code ran a `.to(f32)` operation which was a no-op...
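
To illustrate why the cast disappears at trace time, a small standalone sketch (plain PyTorch, not the torchtitan code):

```python
import torch

x32 = torch.randn(4, dtype=torch.float32)
# On an fp32 tensor, .to(torch.float32) returns the same tensor, so a
# trace of the fp32 (non-FSDP) model records no cast at all...
print(x32.to(torch.float32) is x32)  # True

x16 = x32.to(torch.bfloat16)
# ...but under FSDP mixed precision the activation may arrive in bf16,
# where the same line is a real cast the traced graph no longer performs.
print(x16.to(torch.float32).dtype)  # torch.float32
```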

Add a docstring for the manual stage and an example under `basic/`. Made `input_args` a required argument.

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #1109
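
For context, a hypothetical sketch of passing `input_args` to a manual stage. Only the `input_args` name comes from the snippet above; the class name, import path, and the other parameters are assumptions, not the confirmed API:

```python
import torch
from pippy import ManualPipelineStage  # hypothetical import path

stage_module = torch.nn.Linear(512, 512)  # the submodule this rank owns

stage = ManualPipelineStage(
    stage_module,
    stage_index=0,                # assumed parameter names
    num_stages=4,
    device=torch.device("cuda"),
    # input_args is now required: example inputs used to infer the
    # shapes/dtypes of the tensors this stage receives.
    input_args=(torch.randn(8, 512, device="cuda"),),
)
```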

**1D**
- #1108

**2D (FSDP)**
- #1104
- #1105

**3D (TP)**

An automatic graph-based pipeline splitting algorithm. The goal of the method is to split the computation graph into stages to minimize the communication between the stages while trying to balance...
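
As a toy illustration of that objective — not PiPPy's actual algorithm — a greedy splitter over a linear chain of per-node costs. Contiguous stages confine communication to stage boundaries, and the running target balances compute:

```python
def split_chain(costs, num_stages):
    """Partition nodes 0..n-1 into contiguous stages of roughly equal
    total cost (an illustrative sketch, not the real splitting method)."""
    target = sum(costs) / num_stages
    stages, current, acc = [], [], 0.0
    for i, c in enumerate(costs):
        current.append(i)
        acc += c
        nodes_left = len(costs) - i - 1
        stages_left = num_stages - len(stages) - 1
        # Close the stage once we reach the target, but keep at least
        # one node available for each remaining stage.
        if acc >= target and stages_left > 0 and nodes_left >= stages_left:
            stages.append(current)
            current, acc = [], 0.0
    if current:
        stages.append(current)
    return stages

# e.g. [[0, 1, 2], [3, 4], [5, 6]]: three roughly equal-cost stages
print(split_chain([1, 1, 4, 2, 2, 1, 1], num_stages=3))
```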

Once we train using the `Pipe` object and the GPipe scheduler, how can we get the trained model back as a normal `nn.Module`?

- [x] FSDP + PP
- [ ] DDP + PP
- [ ] DCP path