florence-2 topic
maestro
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.
taggui
Tag manager and captioner for image datasets
Surveillance_Video_Summarizer
VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 vision-language model. Includes a Gradio-based interface for que...
autodistill-grounded-sam-2
Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.
autodistill-florence-2
Use Florence-2 to auto-label data for use in training fine-tuned object detection models.
rem-wm
Watermark remover tool that leverages the capabilities of Microsoft Florence and Lama Cleaner models.
wd-llm-caption-cli
A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.
Vision-language-models-VLM
Vision-language model fine-tuning notebooks and use cases (MedGemma, PaliGemma, Florence, ...)
WatermarkRemover-AI
AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly P...
X-VLA
The official implementation of "Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model"