Multi-Modal-Transformer

Metadata

The repository collects various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and self-supervised l...

Reading list in Transformer

This repo aims to collect recent popular Transformer papers, code, and learning resources in the domains of Vision Transformer, NLP, multi-modal learning, etc.

Topics (paper and code)

Image Transformer
Video Transformer
Video & Language & other modality Transformer
Image & Language & other modality Transformer
Natural Language Processing Transformer
Efficient Transformer
Model Compression
Self-Supervised Learning in Vision

Other interesting papers in related domains

Review Paper in multi-modal

Video-language

Tutorials and workshops

Datasets

Multi-modal Datasets

Blogs

Lil's blogs

Tools

PyTorchVideo: a deep learning library for video understanding research
horovod: a tool for multi-GPU parallel processing
accelerate: an easy API for mixed precision and any kind of distributed computing
optuna: hyperparameter search
AI Conference Deadlines

About

language

vision-transformer

multi-modal

video-language

video-transformer

mlp-mixer

efficiency-transformer

image-transformer

multi-modal-cvpr2021

transformer-readling-list

211

Stars

29

Forks
