multimodal-large-language-models topic
modelscope-agent
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).
Awesome-Multimodal-Papers
A curated list of awesome Multimodal studies.
MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
FreeVA
FreeVA: Offline MLLM as Training-Free Video Assistant
SLAM-LLM
Speech, Language, Audio, and Music Processing with Large Language Models
DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
GAMA
Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
OceanGPT
[ACL 2024] OceanGPT: A Large Language Model for Ocean Science Tasks