multi-modal-learning topic
HyperDenseNet_pytorch
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation
x-clip
A concise but complete implementation of CLIP with various experimental improvements from recent papers
open_clip
An open source implementation of CLIP.
awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
nemar
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
awesome-vision-and-language-pretraining
A curated list of vision-and-language pre-training (VLP). :-)
Deep-Learning-Framework-for-Multi-modal-Product-Classification
Code repository for Rakuten Data Challenge: Multimodal Product Classification and Retrieval.
Chinese-CLIP
Chinese version of CLIP, which achieves Chinese cross-modal retrieval and representation generation.
NeuralMerger
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelli...
Multimodal-Remote-Sensing-Toolkit
A Python tool to perform deep learning experiments on multimodal remote sensing data.