multimodal-pre-trained-model topic


donut

5.4k stars, 432 forks, 44 watchers

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

unilm

18.7k stars, 2.4k forks, 300 watchers

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

LiLT

326 stars, 39 forks

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Multimodality-Representation-Learning

61 stars, 6 forks

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl....