multi-modal-learning topic

List multi-modal-learning repositories

hcaptcha-challenger

1.4k
Stars
256
Forks

🥂 Gracefully face hCaptcha challenges with an MoE (ONNX) embedded solution.

japanese-clip

64
Stars
6
Forks

Japanese CLIP by rinna Co., Ltd.

TRAR-VQA

61
Stars
17
Forks

[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"

Multi-modal analysis of sentiment and emotion in multi-speaker conversations.

SAM-SLR-v2

30
Stars
8
Forks

SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.

WSS-CMER

20
Stars
3
Forks

Code for the paper "Weakly supervised segmentation with cross-modality equivariant constraints", available at https://arxiv.org/pdf/2104.02488.pdf

prismer

1.3k
Stars
74
Forks
9
Watchers

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

Macaw-LLM

1.4k
Stars
109
Forks
20
Watchers

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

CVPR-2023-24-Papers

282
Stars
19
Forks

CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included...

Achelous

132
Stars
7
Forks

Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar