vision-language-pretraining topic
awesome-japanese-llm
Overview of Japanese LLMs
PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox. Equipped with high per...
FLAIR
FLAIR: A Foundation LAnguage-Image model of the Retina for fundus image understanding.
DeCLIP
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
ptp
[CVPR 2023] Code for "Position-guided Text Prompt for Vision-Language Pre-training"
b2t
Bias-to-Text: Debiasing Unknown Visual Biases through Language Interpretation
Multimodality-Representation-Learning
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
BLIText
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
VLMixer
VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix (ICML 2022)