Ke Wen
Dear developers, could you please help with the following errors? Thank you! ``` $ git clone https://github.com/google/nccl-fastsocket.git $ cd nccl-fastsocket $ bazel build :all WARNING: Output base '/home/user/.cache/bazel/_bazel_user/1340a46a9e7502c5cf03e1a0a087e4f3' is...
The purpose of this PR is to show: 1. A one-line change is needed -- remove this line: ``` self.freqs_cis = self.freqs_cis.to(h.device) ``` Reason 1: compile does not support in-place attribute mutation....
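A minimal sketch of one way to keep such a tensor on the right device without reassigning the attribute in `forward`. It assumes the tensor can be registered as a buffer; `TinyBlock`, its sizes, and the way `freqs_cis` is used are hypothetical and are not taken from the PR.

```python
import torch
import torch.nn as nn


class TinyBlock(nn.Module):
    """Hypothetical module; only the buffer handling matters here."""

    def __init__(self, dim: int = 8, max_len: int = 16):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        # A registered buffer moves together with the module on .to(device),
        # so forward() never needs to reassign self.freqs_cis.
        self.register_buffer(
            "freqs_cis", torch.randn(max_len, dim), persistent=False
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        seq_len = h.shape[1]
        # Read-only access through a local variable; no attribute is mutated.
        freqs = self.freqs_cis[:seq_len]
        return self.proj(h) * freqs


block = TinyBlock()
out = block(torch.randn(2, 4, 8))  # (batch, seq_len, dim)
```

With this layout, calling `block.to(device)` once before running the model replaces the per-call device move.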
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #125449 * #125448 * #125273 cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj...
`torch.export` has a strict mode and a non-strict mode. For the difference, see [Non-Strict Export](https://pytorch.org/docs/stable/export.html#non-strict-export). This PR switches to non-strict mode by default, which improves the tracing success rate (no Dynamo graph breaks).
Currently, every test defines its own example model. We should have a model registry to deduplicate those models, so that the tests just fetch from it.
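A registry of this kind could look roughly like the sketch below. All names (`register_model`, `get_model`, `ExampleMLP`) are hypothetical, and a plain Python class stands in for a real model so the sketch has no dependencies.

```python
from typing import Callable, Dict

# Maps a model name to a zero-argument factory that builds a fresh instance.
_MODEL_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_model(name: str):
    """Decorator that records a model factory under the given name."""

    def decorator(factory: Callable[[], object]):
        if name in _MODEL_REGISTRY:
            raise ValueError(f"model {name!r} is already registered")
        _MODEL_REGISTRY[name] = factory
        return factory

    return decorator


def get_model(name: str) -> object:
    """Build and return a new instance of the registered model."""
    try:
        factory = _MODEL_REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown model {name!r}") from None
    return factory()


@register_model("example_mlp")
class ExampleMLP:
    """Stand-in for a shared example model."""

    def __init__(self) -> None:
        self.num_layers = 2


# A test fetches the shared model instead of defining its own copy.
model = get_model("example_mlp")
```

Returning a fresh instance on every call keeps tests from sharing mutated state.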
Test case: ``` torchrun --nproc-per-node 4 test_fwd.py ``` Reason: When stage 0 finishes computation and hits batch_send, all corresponding comms from other ranks...
Need to investigate whether this is a test issue, a PiPPy issue, or a general PyTorch issue.
## Current status Working ``` # PP = 2, TP = 4 $ torchrun --nproc-per-node 8 pippy_llama.py ['make', 'think', 'you', 'be', 'getting', 'great', 'favorite', 'right'] ['make', 'think', 'you', 'be', 'getting',...