PiPPy

Pipeline Parallelism for PyTorch

123 PiPPy issues

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* __->__ #1079
* #1077


I'm assuming (but haven't debugged it) that using a single microbatch hits some corner case that makes the schedule hang. Repro: https://gist.github.com/wconstab/365aa5615270645c11658e28f8051e54

Currently every test defines its own example model. We should have a model registry to deduplicate those models, so that tests simply fetch from it; a sketch of what that could look like follows.
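
A minimal sketch of such a registry; none of this is existing PiPPy code, and the names `MODEL_REGISTRY`, `register_model`, and `get_model` are hypothetical:

```python
# Hypothetical test-model registry: factories keyed by name, so tests fetch
# shared models instead of redefining them.
from typing import Callable, Dict

import torch.nn as nn

MODEL_REGISTRY: Dict[str, Callable[[], nn.Module]] = {}

def register_model(name: str):
    """Decorator that records a zero-arg model factory under `name`."""
    def wrapper(factory: Callable[[], nn.Module]):
        MODEL_REGISTRY[name] = factory
        return factory
    return wrapper

@register_model("mlp2")
def _mlp2() -> nn.Module:
    # Small two-layer MLP shared across tests.
    return nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))

def get_model(name: str) -> nn.Module:
    """Instantiate a fresh copy of the registered model."""
    return MODEL_REGISTRY[name]()
```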

When running the `examples/llama/pippy_llama.py` script on two A800 GPUs, each rank consumes memory equal to the full model size, rather than sharding the weights across both GPUs. Additionally, the...

Would it be possible to provide a few examples of how to train a network using a pipeline stage and a pipeline schedule? All the examples I have gone through so far are dedicated... (a hedged training-loop sketch follows below)
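
A hedged sketch of such a training loop, assuming the `torch.distributed.pipelining` API that PiPPy migrated into, exactly two ranks launched with `torchrun`, and placeholder layer sizes and random data:

```python
# Sketch only: two-stage pipeline training with PipelineStage + ScheduleGPipe.
# Assumes world_size == 2; model split, sizes, and data are placeholders.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.pipelining import PipelineStage, ScheduleGPipe

dist.init_process_group("nccl")
rank, world = dist.get_rank(), dist.get_world_size()
device = torch.device("cuda", rank)

# Toy model: rank 0 owns the first two layers, rank 1 the last one.
full = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
stage_mod = (nn.Sequential(*list(full)[:2]) if rank == 0
             else nn.Sequential(*list(full)[2:])).to(device)

stage = PipelineStage(stage_mod, stage_index=rank, num_stages=world, device=device)
loss_fn = nn.CrossEntropyLoss()
schedule = ScheduleGPipe(stage, n_microbatches=4, loss_fn=loss_fn)
opt = torch.optim.Adam(stage_mod.parameters(), lr=1e-3)

for _ in range(10):  # placeholder loop over random data
    x = torch.randn(8, 32, device=device)
    y = torch.randint(0, 10, (8,), device=device)
    opt.zero_grad()
    if rank == 0:
        schedule.step(x)                        # first stage feeds inputs
    else:
        losses = []
        schedule.step(target=y, losses=losses)  # last stage computes the loss
    opt.step()                                  # each rank updates its own stage
```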

Fixing this issue depends on first fixing https://github.com/pytorch/pytorch/issues/123411, so that the original `split_graph` retains the `non_persistent` attribute.

```python
if is_buffer:
    _assign_attr(
        param_val,
        callee,
        param_fqn,
        attr_kind=_AttrKind.BUFFER,
        persistent=True,  #
```

Use torchtrain to showcase PP, and update the PP docs to reflect best practices.

I'd like to load a pipeline stage into a VM with low disk storage. Is there a way to export a pipeline stage and have it run independently from the...

Hello, I was wondering if someone could provide an example or some guidance on how to use PiPPy for models that will not fit on one GPU. I want to... (a hedged sketch follows below)
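
One hedged approach, assuming the `torch.distributed.pipelining` front end that PiPPy moved into: trace the model on the meta device so that no rank ever materializes the full weights, then build and load only this rank's stage. `MyHugeModel`, `load_stage_weights`, and the split point `"layers.16"` are placeholders, not real names.

```python
# Sketch only, not verified against a specific PiPPy release.
import os
import torch
from torch.distributed.pipelining import SplitPoint, pipeline

stage_idx = int(os.environ.get("RANK", "0"))

with torch.device("meta"):              # parameters get shapes but no storage
    model = MyHugeModel()               # placeholder: your large model

example_input = torch.zeros(1, 1024, dtype=torch.long, device="meta")
pipe = pipeline(
    model,
    mb_args=(example_input,),
    split_spec={"layers.16": SplitPoint.BEGINNING},  # placeholder boundary
)

stage_mod = pipe.get_stage_module(stage_idx)  # only this rank's submodule
load_stage_weights(stage_mod, stage_idx)      # placeholder: load just this shard
stage = pipe.build_stage(stage_idx, device=torch.device("cuda", stage_idx))
```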
