torchtune
enable customizing activation functions in the CLIP vision encoder
Context
What is the purpose of this PR? Is it to
- [x] add a new feature
- [ ] fix a bug
- [ ] update tests and/or documentation
- [ ] other (please add here)
Please link to any issues this PR addresses.
Changelog
What are the changes made in this PR?
This PR enables users to customize the activation function used in the CLIP vision encoder.
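A rough sketch of what this could look like from the user side (hypothetical example; the `clip_vision_encoder` builder arguments and the `activation` parameter name are assumptions for illustration, not necessarily the exact API landed in this PR):

```python
# Hypothetical usage sketch -- builder/parameter names are assumptions for illustration.
import torch.nn as nn
from torchtune.models.clip import clip_vision_encoder

# Build a CLIP vision encoder with GELU instead of the default SiLU activation.
# The size-related arguments are example values, not prescribed by this PR.
encoder = clip_vision_encoder(
    tile_size=224,
    patch_size=14,
    embed_dim=1024,
    num_layers=24,
    num_heads=16,
    activation=nn.GELU,  # new: pass any nn.Module activation class
)
```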
Test plan
Please make sure to do each of the following if applicable to your PR. (If you're not sure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.)
- [ ] run pre-commit hooks and linters (make sure you've first installed via `pre-commit install`)
- [ ] add unit tests for any new functionality
- [x] update docstrings for any new or updated methods or classes
- [ ] run unit tests via `pytest tests`
- [ ] run recipe tests via `pytest tests -m integration_test`
- [x] manually run any new or modified recipes with sufficient proof of correctness
- [ ] include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)
UX
If your function changed a public API, please add a dummy example of what the user experience will look like when calling it. Example of docstring: https://github.com/pytorch/torchtune/blob/6a7951f1cdd0b56a9746ef5935106989415f50e3/torchtune/modules/vision_transformer.py#L285 Example in our docs: https://pytorch.org/torchtune/main/tutorials/qat_finetune.html#applying-qat-to-llama3-models
- [ ] I did not change any public API;
- [x] I have added an example to docs or docstrings;
:link: Helpful Links
:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1385
- :page_facing_up: Preview Python docs built from this PR
Note: Links to docs will display an error until the docs builds have been completed.
:white_check_mark: No Failures
As of commit e4a9cb5ce56f6de255fda8a4b35c0511845a99cf with merge base 9e65fa9ae67ce09e953cb68b5093d42cb6b31310:
:green_heart: Looks good so far! There are no failures yet. :green_heart:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
hey @Gasoonjia, you may be interested in our upcoming flamingo PR: https://github.com/pytorch/torchtune/pull/1357/files
feel free to ping me in workplace https://fb.workplace.com/profile.php?id=61556984579937
Hey @RdoubleA, thanks for the comment! I've updated the PR, please take a look! I'm working on leveraging torchtune's (tt) modules to reproduce LLaVA-1.5 based on the Hugging Face (hf) implementation, and I realized that hf uses QuickGELU instead of tt's default activation (SiLU), so I'd like a way to customize the activation function I use.
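For context, here is a minimal sketch of the QuickGELU activation used in the OpenAI/HF CLIP implementation (the class below is illustrative; how it gets plugged into the encoder builder depends on the final API of this PR):

```python
import torch
import torch.nn as nn


class QuickGELU(nn.Module):
    """QuickGELU as used in the OpenAI/HF CLIP implementation: x * sigmoid(1.702 * x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(1.702 * x)


# Hypothetical: pass the custom activation into the CLIP vision encoder builder
# instead of the default SiLU (parameter name assumed for illustration):
# encoder = clip_vision_encoder(..., activation=QuickGELU)
```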
> reproduce llava1.5 based on huggingface
FYI, we had some work on llava transforms: https://github.com/pytorch/torchtune/pull/1057
Hey @felipemello1, glad to see you on GitHub! Yes, I'm keeping an eye on your wonderful PR, and I'll definitely need your help when I start working on Flamingo! I noticed you are updating the CLIP module in that PR; will there be a big update to it, especially on the modeling side?
> reproduce llava1.5 based on huggingface
> FYI, we had some work on llava transforms: #1057
Thanks for sharing! Will try to leverage your work!
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 72.81%. Comparing base (9e65fa9) to head (e4a9cb5).
Additional details and impacted files
@@ Coverage Diff @@
## main #1385 +/- ##
==========================================
+ Coverage 70.57% 72.81% +2.24%
==========================================
Files 272 272
Lines 12895 12895
==========================================
+ Hits 9101 9390 +289
+ Misses 3794 3505 -289
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.