vision-language topic
awesome-japanese-llm
Overview of Japanese LLMs (日本語LLMまとめ)
DriveLM
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
awesome-video-text-datasets
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.
ONE-PEACE
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Awesome-Long-Context
A curated list of resources on long context in large language models and video understanding.
Proto-CLIP
Code release for Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning
SEED
Official implementation of SEED-LLaMA (ICLR 2024).
vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" (Oral @ ICLR 2023)