image-text-retrieval topic
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
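For context, a minimal image-text matching sketch using the Hugging Face port of BLIP rather than the repo's own scripts; the `Salesforce/blip-itm-base-coco` checkpoint and the local `photo.jpg` path are illustrative assumptions:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForImageTextRetrieval

# Load the image-text matching (ITM) variant of BLIP.
processor = BlipProcessor.from_pretrained("Salesforce/blip-itm-base-coco")
model = BlipForImageTextRetrieval.from_pretrained("Salesforce/blip-itm-base-coco")

image = Image.open("photo.jpg").convert("RGB")  # placeholder image path
inputs = processor(images=image, text="a dog playing in the park", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# itm_score holds two logits per pair; index 1 is the "match" class.
match_prob = torch.softmax(outputs.itm_score, dim=-1)[:, 1]
print(f"match probability: {match_prob.item():.3f}")
```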
rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Chinese-CLIP
A Chinese version of CLIP that supports Chinese cross-modal retrieval and representation generation.
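A minimal sketch of Chinese text-to-image matching with this model, assuming the `OFA-Sys/chinese-clip-vit-base-patch16` checkpoint loaded through Hugging Face Transformers rather than the repo's own tooling; the image path and candidate captions are placeholders:

```python
import torch
from PIL import Image
from transformers import ChineseCLIPModel, ChineseCLIPProcessor

model = ChineseCLIPModel.from_pretrained("OFA-Sys/chinese-clip-vit-base-patch16")
processor = ChineseCLIPProcessor.from_pretrained("OFA-Sys/chinese-clip-vit-base-patch16")

image = Image.open("pikachu.jpg").convert("RGB")  # placeholder image path
texts = ["皮卡丘", "杰尼龟", "妙蛙种子"]          # candidate Chinese captions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Softmax over the image-to-text logits gives a probability per caption.
probs = outputs.logits_per_image.softmax(dim=-1)
print(probs)
```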
PicQuery
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
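PicQuery itself is an Android app, but the underlying retrieval idea (embed every local image with CLIP once, then rank images by cosine similarity against an embedded text query) can be sketched in Python; the `photos/` directory, query string, and `openai/clip-vit-base-patch32` checkpoint below are illustrative assumptions, not part of the app:

```python
import torch
from pathlib import Path
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Index step: embed and L2-normalize every local image once.
paths = sorted(Path("photos").glob("*.jpg"))  # placeholder image folder
images = [Image.open(p).convert("RGB") for p in paths]
with torch.no_grad():
    img_emb = model.get_image_features(**processor(images=images, return_tensors="pt"))
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

# Query step: embed the natural-language query and rank by cosine similarity.
query = "a receipt from a restaurant"
with torch.no_grad():
    txt_emb = model.get_text_features(**processor(text=[query], return_tensors="pt", padding=True))
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)

scores = (txt_emb @ img_emb.T).squeeze(0)
for i in scores.argsort(descending=True)[:5]:
    print(paths[i], float(scores[i]))
```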
Text2Poster-ICASSP-22
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
image-captioning
Image captioning using Python and BLIP
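A minimal captioning sketch in the same spirit, assuming the Hugging Face `Salesforce/blip-image-captioning-base` checkpoint and a local `photo.jpg`; the linked repo may wire this up differently:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("photo.jpg").convert("RGB")  # placeholder image path
inputs = processor(images=image, return_tensors="pt")

# Generate a short caption for the image.
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```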
CrossGET
[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o (open-source multimodal chat models approaching GPT-4o performance)
CPL
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"