composer
Adding filtering for optimizer parameters to only trainable parameters
What does this PR do?
After the model is wrapped with FSDP, the optimizer should contain only parameters that require gradients (non-frozen parameters). This change is useful for LoRA, where most of the base model's weights are frozen.
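To illustrate the idea, here is a minimal sketch of filtering optimizer parameter groups down to trainable parameters. It is not the code from this PR: `filter_trainable_param_groups` and `FakeParam` are hypothetical names, and `FakeParam` is only a stand-in for `torch.nn.Parameter` so the example runs without PyTorch. The sketch assumes the usual PyTorch layout, where each param group is a dict with a `"params"` list and each parameter exposes `requires_grad`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FakeParam:
    """Hypothetical stand-in for torch.nn.Parameter (only the fields used here)."""
    name: str
    requires_grad: bool = True


def filter_trainable_param_groups(param_groups: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the param groups keeping only params with requires_grad=True."""
    filtered = []
    for group in param_groups:
        trainable = [p for p in group["params"] if p.requires_grad]
        if not trainable:
            continue  # drop groups whose parameters are all frozen
        new_group = dict(group)  # shallow copy keeps lr, weight_decay, etc.
        new_group["params"] = trainable
        filtered.append(new_group)
    return filtered


# LoRA-like setup: base weights frozen, adapter weights trainable.
groups = [
    {"params": [FakeParam("base.weight", requires_grad=False),
                FakeParam("lora_A.weight")], "lr": 1e-4},
    {"params": [FakeParam("base.bias", requires_grad=False)], "lr": 1e-4},
]
trainable_groups = filter_trainable_param_groups(groups)
print([[p.name for p in g["params"]] for g in trainable_groups])
```

The function returns new group dicts rather than mutating the input, so the original groups stay available for comparison. Because it relies only on the `"params"` key and the `requires_grad` attribute, the same function should also work on the `param_groups` of a real PyTorch optimizer.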
What issue(s) does this change relate to?
https://mosaicml.atlassian.net/browse/CO-2221
Before submitting
- [ ] Have you read the contributor guidelines?
- [ ] Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
- [ ] Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
- [ ] Did you update any related docs and document your change?
- [ ] Did you update any related tests and add any new tests related to your change? (see testing)
- [ ] Did you run the tests locally to make sure they pass?
- [x] Did you run `pre-commit` on your change? (see the `pre-commit` section of prerequisites)
Looks good! This will make FSDP lighter for models with some frozen weights.
Closing as out of date