vision-language topic
hulc2
[ICRA2023] Grounding Language with Visual Affordances over Unstructured Data
image-captioning
Image captioning using Python and BLIP
multimodal-meta-learn
Official code repository for "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning" (published at ICLR 2023).
Visual-Chinese-LLaMA-Alpaca
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)
Sight-Beyond-Text
This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics".
NTU-2022Fall-DLCV
Deep Learning for Computer Vision by Frank Wang (王鈺強)
daclip-uir
[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.
Awesome-Vision-Language-Finetune
Awesome List of Vision Language Prompt Papers
SciGraphQA
SciGraphQA: Large-Scale Synthetic Multi-Turn Question-Answering Dataset for Scientific Graphs
NuScenes-QA
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.