torchtune
enable customizing activation functions in the CLIP vision encoder
Context
What is the purpose of this PR? Is it to
- [x] add a new feature
- [ ] fix a bug
- [ ] update tests and/or documentation
- [ ] other (please add here)
Please link to any issues this PR addresses.
Changelog
What are the changes made in this PR?
This PR enables users to customize the activation function used in the CLIP vision encoder.
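A rough sketch of what this could look like from the user side (hypothetical example; the `clip_vision_encoder` builder arguments and the `activation` parameter name are assumptions for illustration, not necessarily the exact API landed in this PR):

```python
# Hypothetical usage sketch -- builder/parameter names are assumptions for illustration.
import torch.nn as nn
from torchtune.models.clip import clip_vision_encoder

# Build a CLIP vision encoder with GELU instead of the default SiLU activation.
# The size-related arguments are example values, not prescribed by this PR.
encoder = clip_vision_encoder(
    tile_size=224,
    patch_size=14,
    embed_dim=1024,
    num_layers=24,
    num_heads=16,
    activation=nn.GELU,  # new: pass any nn.Module activation class
)
```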
Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.)
- [ ] run pre-commit hooks and linters (make sure you've first installed via `pre-commit install`)
- [ ] add unit tests for any new functionality
- [x] update docstrings for any new or updated methods or classes
- [ ] run unit tests via `pytest tests`
- [ ] run recipe tests via `pytest tests -m integration_test`
- [x] manually run any new or modified recipes with sufficient proof of correctness
- [ ] include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)
UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it. Example of docstring: https://github.com/pytorch/torchtune/blob/6a7951f1cdd0b56a9746ef5935106989415f50e3/torchtune/modules/vision_transformer.py#L285 Example in our docs: https://pytorch.org/torchtune/main/tutorials/qat_finetune.html#applying-qat-to-llama3-models
- [ ] I did not change any public API;
- [x] I have added an example to docs or docstrings;
:link: Helpful Links
:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1385
- :page_facing_up: Preview Python docs built from this PR
Note: Links to docs will display an error until the docs builds have been completed.
:white_check_mark: No Failures
As of commit e4a9cb5ce56f6de255fda8a4b35c0511845a99cf with merge base 9e65fa9ae67ce09e953cb68b5093d42cb6b31310:
:green_heart: Looks good so far! There are no failures yet. :green_heart:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
hey @Gasoonjia, you may be interested in our upcoming flamingo PR: https://github.com/pytorch/torchtune/pull/1357/files
feel free to ping me in workplace https://fb.workplace.com/profile.php?id=61556984579937
Hey @RdoubleA, thanks for the comment! I've updated the PR, please take a look! I'm working on leveraging torchtune's (tt) modules to reproduce LLaVA-1.5 based on the Hugging Face (hf) implementation, and I realized that hf uses QuickGELU instead of tt's default activation (SiLU), so I'd like a way to customize the activation function I use.
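For context, here is a minimal sketch of the QuickGELU activation used in the OpenAI/HF CLIP implementation (the class below is illustrative; how it gets plugged into the encoder builder depends on the final API of this PR):

```python
import torch
import torch.nn as nn


class QuickGELU(nn.Module):
    """QuickGELU as used in the OpenAI/HF CLIP implementation: x * sigmoid(1.702 * x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(1.702 * x)


# Hypothetical: pass the custom activation into the CLIP vision encoder builder
# instead of the default SiLU (parameter name assumed for illustration):
# encoder = clip_vision_encoder(..., activation=QuickGELU)
```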
> reproduce llava1.5 based on huggingface
FYI, we had some work on llava transforms: https://github.com/pytorch/torchtune/pull/1057
Hey @felipemello1, glad to see you on GitHub! Yes, I'm keeping an eye on your wonderful PR, and I'll definitely need your help when I start working on Flamingo! I noticed you are updating the CLIP module in that PR; will there be a big update to it, especially on the modeling side?
> reproduce llava1.5 based on huggingface
> FYI, we had some work on llava transforms: #1057
Thanks for sharing! Will try to leverage your work!
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 72.81%. Comparing base (9e65fa9) to head (e4a9cb5).
Additional details and impacted files
@@ Coverage Diff @@
## main #1385 +/- ##
==========================================
+ Coverage 70.57% 72.81% +2.24%
==========================================
Files 272 272
Lines 12895 12895
==========================================
+ Hits 9101 9390 +289
+ Misses 3794 3505 -289
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.