multi-modal topic
MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
DeepKE
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
erlexec
Represent, send, store and search multimodal data
docarray
Represent, send, store and search multimodal data
valhalla
Open Source Routing Engine for OpenStreetMap
DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
MedMNIST
[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification
nemar
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Transformer-in-Vision
Recent Transformer-based CV and related works.