FSDP example
Deploy Preview for pytorch-examples-preview canceled.
| Name | Link |
|---|---|
| Latest commit | c15c6897066d5a42fff122baac9a49f8a1b87aad |
| Latest deploy log | https://app.netlify.com/sites/pytorch-examples-preview/deploys/62c741fbf9c2cc00089990df |
@rohan-varma @lessw2020 @HamidShojanazeri please let @hudeven and me know once you'd like to merge this PR; it has been open for a while. Feel free to dismiss any feedback you don't believe is relevant.
Let me review - I was not even aware this PR existed until today, so thanks for the direct link.
General comment - this example does not use activation checkpointing due to the timing of this PR (it wasn't added in FSDP until after this PR).
But I think it would be good to update this example with it, to make sure it's present as activation checkpointing is one of our biggest throughput boosters.
Deploy Preview for pytorch-examples-preview canceled.
| Name | Link |
|---|---|
| Latest commit | f62b4aec7bff832fc65b59aa62d60e94ddd6b39e |
| Latest deploy log | https://app.netlify.com/sites/pytorch-examples-preview/deploys/646e47182d49400008c6a694 |
@msaroufim, @hudeven sorry for the delay. I addressed the comments and made the code more modular; it would be great if we could merge this.
> General comment - this example does not use activation checkpointing due to the timing of this PR (it wasn't added in FSDP until after this PR). But I think it would be good to update this example with it, to make sure it's present as activation checkpointing is one of our biggest throughput boosters.
Added activation checkpointing (AC) and the rate limiter, as well as model checkpointing.
@svekars any idea if the doc build is flaking for any reason?
@HamidShojanazeri do you mind rebasing on main to see if the error goes away?