multimodal-datasets topic
FVTA_MemexQA
Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19
CMU-MultimodalSDK
CMU MultimodalSDK is a machine learning platform for developing advanced multimodal models and for easily accessing and processing multimodal datasets.
Multimodal-short-video-dataset-and-baseline-classification-model
A dataset of 500,000 multimodal short videos with baseline models (TensorFlow 2.0).
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
MFT
PyTorch implementation of Multimodal Fusion Transformer for Remote Sensing Image Classification.
eipy
Ensemble Integration: a customizable pipeline for generating multi-modal, heterogeneous ensembles
Awesome-Multi-Modal-Dialog
[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics
Multimodality-Representation-Learning
This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....
VQASynth
Compose multimodal datasets 🎹
Multimodal-datasets
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the information...