multimodal-learning topic

List multimodal-learning repositories

LViT

259 Stars, 24 Forks

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"

MSAF

67 Stars, 9 Forks

Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"

VIG

21 Stars, 3 Forks

Dataset for Visually Indicated Sound Generation by Perceptually Optimized Classification

slp

21 Stars, 7 Forks

Utils and modules for speech, language, and multimodal processing using PyTorch and PyTorch Lightning

open_flamingo

3.5k Stars, 263 Forks

An open-source framework for training large multimodal models.

OFASys

142 Stars, 10 Forks

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

UniRepLKNet

835 Stars, 52 Forks

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

ICCV-2023-Papers

919 Stars, 42 Forks

ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support...

AMOS

24 Stars, 4 Forks

AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation