vision-and-language-pre-training topic
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Awesome_Matching_Pretraining_Transfering
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insigh...
awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
awesome-vision-and-language-pretraining
A curated list of vision-and-language pre-training (VLP). :-)
Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
SIC-CADS
Code Implementation of "Simple Image-level Classification Improves Open-vocabulary Object Detection" (AAAI'24)