gpt-4-vision topic

List gpt-4-vision repositories

LibreChat

27.5k
Stars
4.9k
Forks

Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message searc...

lobe-chat

61.8k
Stars
12.9k
Forks

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 4 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management...

gptty

48
Stars
7
Forks

ChatGPT wrapper in your TTY

sgpt

398
Stars
33
Forks
398
Watchers

SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.

openai-vision-api-for-videos

61
Stars
9
Forks

Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦

maestro

2.6k
Stars
219
Forks
2.6k
Watchers

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

sports

475
Stars
32
Forks

Cool experiments at the intersection of Computer Vision and Sports ⚽🏃

ViP-LLaVA

292
Stars
22
Forks

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

py-gpt

536
Stars
105
Forks

Desktop AI Assistant powered by GPT-4, GPT-4 Vision, GPT-3.5, Gemini, Claude, Llama 3, DALL-E, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, agents, code/command...

SirChatalot

72
Stars
14
Forks
72
Watchers

SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities, tool...