vision-language-pretraining topics

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for...

mbzuai-oryx

chatbot

clip

gpt-4

llama

FLM

31 Stars, 2 Forks

Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)

TencentARC

language-modeling

vision-language-pretraining

SegCLIP

78 Stars, 8 Forks

PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"

ArrowLuo

contrastive-learning

open-vocabulary

open-vocabulary-semantic-segmentation

semantic-segmentation

svl_adapter

19 Stars, 3 Forks

SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models

omipan

self-supervised-learning

vision-language-pretraining

COSA

38 Stars, 2 Forks

Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

TXH-mercury

video-captioning

video-language-pretrainng

video-qa

video-retrieval