PiPPy issues

Split each layer across multiple GPUs

I tried example_train.py with distributed PyTorch and it works well. However, the example splits the network layer by layer and assigns each layer to a single machine. Is it possible...

EnricoBeltramo

Make RemoteInterpreter use the full implementation of `Interpreter.run`

2

https://github.com/jamesr66a/PiPPy/blob/527af1fd8123d35bd81b9fe304a8d0ed29c9fd8d/pippy/PipelineDriver.py#L565 I wrote `run_until` as a hack; it should probably be a copy-paste of `Interpreter.run` with a termination branch inside (or we should refactor `Interpreter.run` to make implementing like this...

jamesr66a

good first issue

mid-pri

Support LayoutLM models in HF tests

1

pbelevich

good first issue

huggingface

PiPPy

Request for Examples of Pipeline Parallelism with Multiple Machines in PiPPy

1

I would like to use PiPPy for distributed inference with multiple machines and multiple GPUs. However, most of the test cases in the repository are for single-machine testing. Can you...

littlefatfat

TP+PiPPy failing on HF examples.

4

Installing from source with PT nightlies and trying to add TP to the HF inference example, it fails with `RuntimeError: aten.add.Tensor: got mixed distributed and non-distributed tensors`. I am wondering if...

HamidShojanazeri

How to run the gpt2 example on a single node with four GPUs?

I am trying to reproduce the [gpt2 example](https://github.com/pytorch/PiPPy/tree/main/examples/hf/gpt2) on a single node without Slurm to collect some performance metrics, but the code only provides Slurm scripts. How should I modify the...

lsder

Can PiPPy coexist with DeepSpeed?

1

Hi, I want to know whether I can use PiPPy's pipeline-parallel (PP) capability together with DeepSpeed's ZeRO-3 config, so that the combination gives 3D parallelism. Thanks.

leiwen83

Incorrect loss value in the Hugging Face BERT example

Hi, thanks for your nice repo! Steps to reproduce the bug:

1. Change the original examples/hf/bert/pippy_bert.py to the following:

```python
# Copyright (c) Meta Platforms, Inc. and affiliates
import argparse
...
```

pinxuezhao

How to reduce memory costs when running on CPU

I am running HF_inference.py on my CPU and it works well! It successfully applies pipeline parallelism on CPU. However, when applying pipeline parallelism, I found that each rank will...

jiqing-feng

Use TorchDynamo instead of tracing to get the graph?

3

Hi guys, this project is awesome compared with torch pipe. I have some models that are [not supported by tracing](https://pytorch.org/docs/stable/fx.html#limitations-of-symbolic-tracing). Do you have plans to support [TorchDynamo](https://github.com/pytorch/torchdynamo) to get...

Jack47

enhancement

PiPPy
