Multi-Modal-Transformer

The repository collects a variety of multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, video-language transformers and self-supervised l...

Transformer reading list

This repo aims to collect recent popular Transformer papers, code, and learning resources in the domains of Vision Transformer, NLP, multi-modal learning, etc.

Topics (paper and code)

  • Image Transformer

  • Video Transformer

  • Video & Language & other modality Transformer

  • Image & Language & other modality Transformer

  • Natural Language Processing Transformer

  • Efficient Transformer

  • Model compression

  • Self-Supervised Learning in Vision

  • Other interesting papers in related domains

Review papers on multi-modal learning

  • Video-language

Tutorials and workshops

Datasets

  • Multi-modal Datasets

Blogs

Tools