visual-language-models topic

List visual-language-models repositories

ROSGPT_Vision

85 stars, 12 forks

Commanding robots using only Language Models' prompts

CogVLM

5.9k stars, 407 forks

A state-of-the-art-level open visual language model | Multimodal pre-trained model

AlignGPT

29 stars, 3 forks

Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"

crab

177 stars, 26 forks

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/

VCR

23 stars, 1 fork

Official repo for the paper "VCR: Visual Caption Restoration". See arxiv.org/pdf/2406.06462 for details.

CoN-CLIP

17 stars, 1 fork

Implementation of the "Learn No to Say Yes Better" paper.

HOI-Ref

17 stars, 2 forks

Code implementation for the paper "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"

wildclip

15 stars, 1 fork

Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models