vision-language topic

List vision-language repositories

awesome-japanese-llm

1.3k
Stars
38
Forks
1.3k
Watchers

Overview of Japanese LLMs

DriveLM

821
Stars
52
Forks

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

VLTinT

64
Stars
6
Forks

[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

awesome-video-text-datasets

24
Stars
2
Forks

A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.

ONE-PEACE

942
Stars
59
Forks

A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

AlphaCLIP

651
Stars
38
Forks

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Awesome-Long-Context

26
Stars
1
Forks

A curated list of resources on long context in large language models and video understanding.

Proto-CLIP

33
Stars
4
Forks

Code release for Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning

SEED

563
Stars
31
Forks

Official implementation of SEED-LLaMA (ICLR 2024).

vision-language-models-are-bows

235
Stars
15
Forks

Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023