Pipeline Parallelism for PyTorch

Results 123 PiPPy issues

Just dumping issues here as I find them (applying PipelineStage to torchtrain) Stage 1. fwd_inputs all forced to have 'requires_grad=True' -- why? what's our design here? `freqs_cis` could be passed...

I noticed for many of my PRs after running `./format.sh`, it still does not pass the checks in `./check.sh`. This causes the PR to fail in the lint check in...

better engineering

Add try-except around the forward to also log the stage, shapes, etc. before reraising the exception. Look into which debug flags can be used to handle the hang cases. Document...

better engineering
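A minimal sketch of the try-except idea in the issue above, assuming a plain callable stands in for the stage's forward. The names `describe` and `forward_with_context` are hypothetical and are not part of PiPPy. Inputs are summarized by duck typing on a `shape` attribute, so the sketch runs without importing torch.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("pipeline_debug")


def describe(value: Any) -> str:
    """Summarize a value: shape and dtype if it looks tensor-like, else its type name."""
    shape = getattr(value, "shape", None)
    if shape is not None:
        dtype = getattr(value, "dtype", "?")
        return f"{type(value).__name__}(shape={tuple(shape)}, dtype={dtype})"
    return type(value).__name__


def forward_with_context(
    stage_index: int, forward_fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Any:
    """Run forward_fn; on failure, log the stage and input summaries, then re-raise."""
    try:
        return forward_fn(*args, **kwargs)
    except Exception:
        arg_info = [describe(a) for a in args]
        kwarg_info = {k: describe(v) for k, v in kwargs.items()}
        # logger.exception also records the active traceback.
        logger.exception(
            "forward failed on stage %d; args=%s kwargs=%s",
            stage_index, arg_info, kwarg_info,
        )
        raise  # Preserve the original exception and traceback for the caller.
```

The bare `raise` keeps the original exception type, so callers that already catch specific errors are unaffected; only the log gains the stage index and input shapes.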

Will need to update which group the batch_p2p ops are sent to and remove the current assumptions using rank+1 and rank-1.

enhancement

Loss function is currently not implemented: https://github.com/pytorch/PiPPy/blob/f2e605d045cdc64cac31e2dd99a01706eb638a16/pippy/PipelineSchedule.py#L68-L73 We should add the loss function as an argument into PipelineSchedule.step(). This also means that we should change the output of `forward()`: -...

enhancement
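One possible shape for the loss-function proposal above, as a toy sketch only. `ToySchedule` is a hypothetical stand-in and is not PiPPy's `PipelineSchedule`. The `targets` and `loss_fn` parameters and the `(outputs, losses)` return value are assumptions, since the issue text is truncated before it describes the intended output of `forward()`.

```python
from typing import Callable, List, Optional, Sequence, Tuple


class ToySchedule:
    """Hypothetical stand-in for a pipeline schedule, using floats as microbatches."""

    def __init__(self, forward_fn: Callable[[float], float]) -> None:
        self.forward_fn = forward_fn

    def step(
        self,
        microbatches: Sequence[float],
        targets: Optional[Sequence[float]] = None,
        loss_fn: Optional[Callable[[float, float], float]] = None,
    ) -> Tuple[List[float], List[float]]:
        """Run forward on every microbatch; compute per-microbatch losses if loss_fn is given."""
        outputs = [self.forward_fn(mb) for mb in microbatches]
        losses: List[float] = []
        if loss_fn is not None:
            if targets is None or len(targets) != len(microbatches):
                raise ValueError("loss_fn requires one target per microbatch")
            losses = [loss_fn(out, tgt) for out, tgt in zip(outputs, targets)]
        return outputs, losses


# Usage: squared-error loss passed into step() rather than baked into the schedule.
schedule = ToySchedule(lambda x: 2 * x)
outputs, losses = schedule.step([1.0, 2.0], targets=[2.0, 5.0],
                                loss_fn=lambda out, tgt: (out - tgt) ** 2)
```

Passing the loss as an argument keeps the schedule independent of any particular training objective; in a real pipeline only the last stage would evaluate it.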

""" fwd_outputs all forced to have 'requires_grad=True' -- why? what's our design here? freqs_cis could be passed from stage0 to stage1 but is an input value from dataloader and should...

bug

## Current status Working ``` # PP = 2, TP = 4 $ torchrun --nproc-per-node 8 pippy_llama.py ['make', 'think', 'you', 'be', 'getting', 'great', 'favorite', 'right'] ['make', 'think', 'you', 'be', 'getting',...

cla signed

Hi, I get the below error whenever I try to create an optimizer. Please help. optimizer = driver.instantiate_optimizer(torch.optim.Adam) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/nucleus/lib/python3.11/site-packages/torchpippy-0.1.1+8f549f3-py3.11.egg/pippy/PipelineDriver.py", line 1573, in instantiate_optimizer return PipelineOptimizer( ^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/envs/nucleus/lib/python3.11/site-packages/torchpippy-0.1.1+8f549f3-py3.11.egg/pippy/PipelineDriver.py",...