gpt-4-vision topic
LibreChat
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message searc...
lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 4 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management...
sgpt
SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.
openai-vision-api-for-videos
Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
maestro
streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
sports
Cool experiments at the intersection of Computer Vision and Sports ⚽🏃
ViP-LLaVA
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
py-gpt
Desktop AI Assistant powered by GPT-4, GPT-4 Vision, GPT-3.5, Gemini, Claude, Llama 3, DALL-E, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, agents, code/command...
SirChatalot
SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities, tool...