llava topic
uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Sight-Beyond-Text
This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
taggui
Tag manager and captioner for image datasets
FindTheChatGPTer
ChatGPT爆火,开启了通往AGI的关键一步,本项目旨在汇总那些ChatGPT的开源平替们,包括文本大模型、多模态大模型等,为大家提供一些便利
multi_token
Embed arbitrary modalities (images, audio, documents, etc) into large language models.
SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
gpt-4v-distribution-shift
Code for "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"
restai
RestAI is an AIaaS (AI as a Service) open-source platform. Built on top of LlamaIndex, Ollama and HF Pipelines. Supports any public LLM supported by LlamaIndex and any local LLM suported by Ollama. Pr...