Pipeline Parallelism for PyTorch

Results 123 PiPPy issues

Authoring and testing sharded RNG operations needs love.
* Setting the seed via `torch.manual_seed(seed)` does not dispatch to shards.
* Constructing ops so that they'll produce the _same_ RNG choices,...

good first issue
huggingface
PiPPy
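
To illustrate the seeding problem described in the sharded RNG issue above, here is a minimal sketch using only Python's standard `random` module rather than PyTorch. The `ShardedRNG` class and `unsharded_rand` function are hypothetical names invented for this illustration; they are not PiPPy or PyTorch APIs. The sketch assumes that "the same RNG choices" means the concatenated shard outputs should match what a single unsharded generator would produce from the same seed.

```python
import random


class ShardedRNG:
    """Toy stand-in for per-shard random state (hypothetical, not a real API)."""

    def __init__(self, num_shards):
        self.num_shards = num_shards
        # Each shard owns its own generator, so seeding a global
        # generator elsewhere would not affect any of them.
        self.generators = [random.Random() for _ in range(num_shards)]

    def manual_seed(self, seed):
        # Explicitly dispatch the seed to every shard.
        for gen in self.generators:
            gen.seed(seed)

    def rand(self, length):
        """Return one chunk per shard; each shard draws the full
        sequence and keeps only its own slice."""
        if length % self.num_shards != 0:
            raise ValueError("length must be divisible by num_shards")
        chunk = length // self.num_shards
        shards = []
        for rank, gen in enumerate(self.generators):
            full = [gen.random() for _ in range(length)]
            shards.append(full[rank * chunk:(rank + 1) * chunk])
        return shards


def unsharded_rand(seed, length):
    """Reference: the values a single generator would produce."""
    gen = random.Random(seed)
    return [gen.random() for _ in range(length)]


rng = ShardedRNG(num_shards=4)
rng.manual_seed(1234)
flat = [x for shard in rng.rand(8) for x in shard]
# The concatenated shards reproduce the unsharded sequence.
assert flat == unsharded_rand(1234, 8)
```

Having every shard draw the full sequence is wasteful, but it keeps the sharded and unsharded results identical, which makes this kind of op easy to test. Whether that trade-off suits the actual sharded ops is an open question this sketch does not answer.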

This depends on issue https://github.com/pytorch/pytorch/issues/85234. It's hard to debug since the abort doesn't generate a stack trace or any exception that can be caught.

`tanh` is part of the `dtensor_lagging_op_db` but is also in `xfail()`. When I added support for `tanh`, I don't remember making any modifications other than tests. Figure out what is...

SPMD

Add support for backward() in test_dtensor_ops.py since that will cover FW + BW.

SPMD

As part of the FSDP+TP integration, we construct Tensors using `TensorInfo.from_tensor`, which calls `is_pinned`. I'm getting the following error with it: ``` File "/data/home/kumpera/repos/PiPPy/spmd/spmd/tensor/dispatch.py", line 174, in operator_dispatch raise RuntimeError(...

good first issue
SPMD

``` -- Process 0 terminated with the following error: Traceback (most recent call last): File "/data/home/kw2501/repos/PiPPy/PiPPy/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap fn(i, *args) File "/data/home/kw2501/repos/PiPPy/pippy/utils.py", line 107, in run_worker run_master(pp_ranks_per_dp_group[rank], args,...

enhancement

1. Create a branch `hf_example_summarization`
2. Copy [summarization](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization) dir to [hf](https://github.com/pytorch/PiPPy/tree/main/examples/hf)
3. Add files, commit, create a PR (`hf_example_summarization` -> `main`)
4. Create another branch `hf_example_summarization_pippy` on top of `hf_example_summarization` and all PiPPy...

huggingface