vision-and-language topic
awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language Pre-Trained Models (VL-PTMs)
ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
DL-NLP-Readings
My Reading Lists of Deep Learning and Natural Language Processing
UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
conceptual-12m
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
awesome-vision-and-language
A curated list of awesome vision and language resources (still under construction... stay tuned!)
Awesome-Computer-Vision
Awesome Resources for Advanced Computer Vision Topics
TCL
Code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning (CVPR 2022)