PiPPy

Pipeline Parallelism for PyTorch

123 PiPPy issues

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* __->__ #1079
* #1077


I'm assuming (but haven't debugged it) that using a single microbatch hits some corner case that makes the schedule hang. Repro: https://gist.github.com/wconstab/365aa5615270645c11658e28f8051e54

Currently every test defines its own example model. We should have a model registry to deduplicate those models, so that tests simply fetch from it; a sketch of what that could look like follows.
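
A minimal sketch of such a registry; none of this is existing PiPPy code, and the names `MODEL_REGISTRY`, `register_model`, and `get_model` are hypothetical:

```python
# Hypothetical test-model registry: factories keyed by name, so tests fetch
# shared models instead of redefining them.
from typing import Callable, Dict

import torch.nn as nn

MODEL_REGISTRY: Dict[str, Callable[[], nn.Module]] = {}

def register_model(name: str):
    """Decorator that records a zero-arg model factory under `name`."""
    def wrapper(factory: Callable[[], nn.Module]):
        MODEL_REGISTRY[name] = factory
        return factory
    return wrapper

@register_model("mlp2")
def _mlp2() -> nn.Module:
    # Small two-layer MLP shared across tests.
    return nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))

def get_model(name: str) -> nn.Module:
    """Instantiate a fresh copy of the registered model."""
    return MODEL_REGISTRY[name]()
```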

When running the `examples/llama/pippy_llama.py` script on two A800 GPUs, each rank consumes memory equal to the full model size, rather than sharding the weights across both GPUs. Additionally, the...

Would it be possible to provide a few examples of how to train a network using a pipeline stage and a pipeline schedule? All the examples I have gone through so far are dedicated... (a hedged training-loop sketch follows below)
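
A hedged sketch of such a training loop, assuming the `torch.distributed.pipelining` API that PiPPy migrated into, exactly two ranks launched with `torchrun`, and placeholder layer sizes and random data:

```python
# Sketch only: two-stage pipeline training with PipelineStage + ScheduleGPipe.
# Assumes world_size == 2; model split, sizes, and data are placeholders.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.pipelining import PipelineStage, ScheduleGPipe

dist.init_process_group("nccl")
rank, world = dist.get_rank(), dist.get_world_size()
device = torch.device("cuda", rank)

# Toy model: rank 0 owns the first two layers, rank 1 the last one.
full = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
stage_mod = (nn.Sequential(*list(full)[:2]) if rank == 0
             else nn.Sequential(*list(full)[2:])).to(device)

stage = PipelineStage(stage_mod, stage_index=rank, num_stages=world, device=device)
loss_fn = nn.CrossEntropyLoss()
schedule = ScheduleGPipe(stage, n_microbatches=4, loss_fn=loss_fn)
opt = torch.optim.Adam(stage_mod.parameters(), lr=1e-3)

for _ in range(10):  # placeholder loop over random data
    x = torch.randn(8, 32, device=device)
    y = torch.randint(0, 10, (8,), device=device)
    opt.zero_grad()
    if rank == 0:
        schedule.step(x)                        # first stage feeds inputs
    else:
        losses = []
        schedule.step(target=y, losses=losses)  # last stage computes the loss
    opt.step()                                  # each rank updates its own stage
```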

Fixing this issue depends on first fixing https://github.com/pytorch/pytorch/issues/123411, so that the original `split_graph` retains the `non_persistent` attribute.

```python
if is_buffer:
    _assign_attr(
        param_val,
        callee,
        param_fqn,
        attr_kind=_AttrKind.BUFFER,
        persistent=True,  #
```

Use torchtrain to showcase PP, and update the PP docs to reflect best practices.

I'd like to load a pipeline stage into a VM with low disk storage. Is there a way to export a pipeline stage and have it run independently from the...

Hello, I was wondering if someone could provide an example or some guidance on how to use PiPPy for models that will not fit on one GPU. I want to... (a hedged sketch follows below)
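
One hedged approach, assuming the `torch.distributed.pipelining` front end that PiPPy moved into: trace the model on the meta device so that no rank ever materializes the full weights, then build and load only this rank's stage. `MyHugeModel`, `load_stage_weights`, and the split point `"layers.16"` are placeholders, not real names.

```python
# Sketch only, not verified against a specific PiPPy release.
import os
import torch
from torch.distributed.pipelining import SplitPoint, pipeline

stage_idx = int(os.environ.get("RANK", "0"))

with torch.device("meta"):              # parameters get shapes but no storage
    model = MyHugeModel()               # placeholder: your large model

example_input = torch.zeros(1, 1024, dtype=torch.long, device="meta")
pipe = pipeline(
    model,
    mb_args=(example_input,),
    split_spec={"layers.16": SplitPoint.BEGINNING},  # placeholder boundary
)

stage_mod = pipe.get_stage_module(stage_idx)  # only this rank's submodule
load_stage_weights(stage_mod, stage_idx)      # placeholder: load just this shard
stage = pipe.build_stage(stage_idx, device=torch.device("cuda", stage_idx))
```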
