Andrew Gu

Results 32 issues of Andrew Gu

With FSDP2 and per-transformer-block compile, `torch.compile` saves both the SDPA output and the contiguous transposed tensor for backward:
https://github.com/pytorch/torchtitan/blob/7e93822e402c3f470bb7ddb925bbc43701bf8573/torchtitan/models/llama/model.py#L210-L213
However, with SimpleFSDP and full-model compile, `torch.compile` only saves...
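To give a rough sense of why saving both tensors matters, here is a back-of-the-envelope sketch. The SDPA output and its contiguous transposed copy hold the same number of elements, so keeping both roughly doubles that part of the saved-activation memory per attention layer. The shapes and the helper name `saved_activation_bytes` below are hypothetical and chosen only for illustration; they are not taken from the issue.

```python
def saved_activation_bytes(bs, seqlen, n_heads, head_dim, bytes_per_elem=2):
    """Size in bytes of one (bs, n_heads, seqlen, head_dim) tensor.

    bytes_per_elem=2 assumes bf16/fp16 activations (an assumption).
    """
    return bs * n_heads * seqlen * head_dim * bytes_per_elem


# Hypothetical per-layer shapes, not taken from the issue.
sdpa_out = saved_activation_bytes(bs=1, seqlen=8192, n_heads=32, head_dim=128)

# The contiguous transposed copy has the same element count as the SDPA output,
# so saving both costs twice as much as saving only one of them.
both = 2 * sdpa_out

MiB = 2**20
print(f"one tensor saved:   {sdpa_out / MiB:.0f} MiB per layer")
print(f"both tensors saved: {both / MiB:.0f} MiB per layer")
```

This only counts the two tensors named above; other tensors the backward pass needs are ignored.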

## Work Items
* Meta-device initialization / `_apply()` methods
  - [x] Support initial meta-device initialization using `swap_tensors` path
  - [ ] Remove manual padding logic after https://github.com/pytorch/pytorch/issues/113045 @wz337
  - **Outcome:**...

triaged
module: fsdp