multimodality topic
UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
clip-guided-diffusion
A CLI tool/Python module for generating images from text using guided diffusion and CLIP from OpenAI.
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
nmtpytorch
Sequence-to-Sequence Framework in PyTorch
FEDOT
Automated modeling and machine learning framework FEDOT
big-sleep
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. The technique was originally created by https://twitter.com/advadnoun
multimodal-sentiment-analysis
Attention-based multimodal fusion for sentiment analysis
how2-dataset
This repository contains code and metadata for the How2 dataset
fonduer
A knowledge base construction engine for richly formatted data