transformers
Add support for fine-tuning CLIP-like models using contrastive-image-text example
What does this PR do?
The contrastive-image-text example works for fine-tuning models whose model_type is
"clip", but for other CLIP-like models such as "chinese_clip" and "siglip" the
VisionTextDualEncoderConfig class is too specific to CLIP.
This PR adds support for fine-tuning Chinese-CLIP and SigLIP vision models with the contrastive-image-text example.
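To illustrate the kind of dispatch involved, here is a minimal sketch and not the actual diff in this PR. It assumes that CLIP-like composite configs nest their vision settings under a `vision_config` key, while plain vision configs do not. The names `CLIP_LIKE_MODEL_TYPES` and `extract_vision_config`, and the example values, are hypothetical.

```python
# Hypothetical illustration only; not the code changed by this PR.
# Assumption: CLIP-like composite configs keep the vision tower's settings
# under a nested "vision_config" key.
CLIP_LIKE_MODEL_TYPES = {"clip", "chinese_clip", "siglip"}


def extract_vision_config(config: dict) -> dict:
    """Return the nested vision config for CLIP-like models, else the config itself."""
    if config.get("model_type") in CLIP_LIKE_MODEL_TYPES:
        return config["vision_config"]
    return config


# Illustrative config dictionaries (values chosen for the example).
clip_like = {"model_type": "siglip", "vision_config": {"hidden_size": 1152}}
plain_vit = {"model_type": "vit", "hidden_size": 768}

print(extract_vision_config(clip_like))
print(extract_vision_config(plain_vit))
```

Checking only for `model_type == "clip"` would send "chinese_clip" and "siglip" down the plain-config branch, which is the limitation described above.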
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
@amyeroberts @patil-suraj @patrickvonplaten
Fixing up this PR as per the contributor guidelines now.
Happy to receive suggestions for any test candidates.
This has been manually tested by replacing openai/clip-vit-base-patch32
in the contrastive-image-text example with each of the following models:
OFA-Sys/chinese-clip-vit-base-patch16
facebook/metaclip-b32-400m
google/siglip-so400m-patch14-384
laion/CLIP-ViT-B-32-laion2B-s34B-b79K
laion/CLIP-ViT-H-14-laion2B-s32B-b79K
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
openai/clip-vit-base-patch32
openai/clip-vit-large-patch14
openai/clip-vit-large-patch14-336
timm/ViT-SO400M-14-SigLIP-384
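For reference, the substitution can be sketched roughly as below. It follows the model-creation snippet from the example's README, with the vision checkpoint made a parameter. The helper name `build_dual_encoder` and the `VISION_CHECKPOINTS` list name are hypothetical, "roberta-base" is the text model used in the README, and loading the non-CLIP checkpoints is assumed to need the changes in this PR.

```python
# Vision checkpoints listed above (copied from this PR's manual test list).
VISION_CHECKPOINTS = [
    "OFA-Sys/chinese-clip-vit-base-patch16",
    "facebook/metaclip-b32-400m",
    "google/siglip-so400m-patch14-384",
    "laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
    "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k",
    "openai/clip-vit-base-patch32",
    "openai/clip-vit-large-patch14",
    "openai/clip-vit-large-patch14-336",
    "timm/ViT-SO400M-14-SigLIP-384",
]


def build_dual_encoder(vision_checkpoint: str, text_checkpoint: str = "roberta-base"):
    """Hypothetical helper: pair a vision checkpoint with a text checkpoint.

    Mirrors the README snippet; downloads weights, so it needs network access.
    """
    # Imported lazily so the checkpoint list can be used without transformers installed.
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionTextDualEncoderModel,
        VisionTextDualEncoderProcessor,
    )

    model = VisionTextDualEncoderModel.from_vision_text_pretrained(
        vision_checkpoint, text_checkpoint
    )
    tokenizer = AutoTokenizer.from_pretrained(text_checkpoint)
    image_processor = AutoImageProcessor.from_pretrained(vision_checkpoint)
    processor = VisionTextDualEncoderProcessor(image_processor, tokenizer)
    return model, processor


# Example usage (not run here because it downloads model weights):
# model, processor = build_dual_encoder(VISION_CHECKPOINTS[2])
# model.save_pretrained("siglip-roberta")
# processor.save_pretrained("siglip-roberta")
```

The saved directory can then be passed to the example's training script as the model path, as the README does for the CLIP checkpoint.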
Not sure what's going on here:
https://app.circleci.com/pipelines/github/huggingface/transformers/84689/workflows/02d18e8c-af6e-465d-8625-fb3dc53bc03e/jobs/1095368/parallel-runs/0/steps/0-116
https://app.circleci.com/pipelines/github/huggingface/transformers/84689/workflows/02d18e8c-af6e-465d-8625-fb3dc53bc03e/jobs/1095369/parallel-runs/0/steps/0-115
https://app.circleci.com/pipelines/github/huggingface/transformers/84689/workflows/02d18e8c-af6e-465d-8625-fb3dc53bc03e/jobs/1095365/parallel-runs/0/steps/0-117
Hi @tjs-intel, thanks for adding this! For the failing tests, could you try rebasing onto main? We recently had some issues with compatible library versions, which should now be resolved.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.