cross-modal topic
distill-bev
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)
BioT5
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations (EMNLP 2023)
DUET
[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
SAKDN
[IEEE T-IP 2021] Semantics-aware Adaptive Knowledge Distillation for Cross-modal Action Recognition
speech-to-image-translation-without-text
Code for the paper "Direct Speech-to-Image Translation"
Structure-CLIP
[Paper][AAAI 2024] Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations
CMG
The official implementation of Achieving Cross Modal Generalization with Multimodal Unified Representation (NeurIPS '23)
UniPT
[CVPR 2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
cross-modal-hasing-playground
Python implementation of cross-modal hashing algorithms
Multimodality-Representation-Learning
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....