Awesome-Multimodal-Large-Language-Models
Request to add new paper
Hello! Thanks for compiling so many great methods into this very helpful resource. Our new paper (accepted to AAAI 2025) presents a multimodal method for document image understanding. Would you mind adding it to your resource? Thanks!
Title: DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Paper: https://arxiv.org/abs/2406.19101
Code: https://github.com/ZZZHANG-jx/DocKylin
We will cite your work in our camera-ready version.