[Feature request] Support for Florence models
This is my go-to tool for LoRA training. The only thing missing is support for natural language models to train Flux models.
I have plans to add Gemini through the free API (with each user generating their own API key), and I've found a project that implements Florence2 which I could learn from.
Awesome
Added Gemini captioning for the next release (v2.9.3), Florence2 is next; no ETA for Florence2 yet.
https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two https://moondream.ai/playground
These are good; they support both NSFW and SFW content. Gemini and all other online APIs don't support NSFW.
Already planned for Florence, just haven't had the time.
https://huggingface.co/mnemic/paligemma-longprompt-v1-safetensors https://huggingface.co/gokaygokay/paligemma-rich-captions
This seems like a combination of tags + long descriptive caption.
I've added support for Florence2 based on this repo: https://github.com/Particle1904/DatasetHelpers/commit/db21a7069922e06abe4d7d29d6d067f499b191fe
Both captioning and automatic text/watermark removal should be available in the next release (2.9.5). I finally found a reason to sit down and implement it (Florence2 has support for OCR with Regions, and I need to clean up watermarks from a large dataset). These new features are mostly complete, but they won't be available in the user interface until 2.9.5.
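For anyone curious how OCR regions can drive text/watermark cleanup, here is a rough Python sketch. It is not the project's actual implementation. It assumes the regions arrive as Florence2-style quad boxes (eight numbers per region: four x,y corner points), and the helper names `quad_to_rect` and `build_mask` are made up for illustration. The resulting mask marks the pixels an inpainting or fill step would then overwrite.

```python
from typing import List, Sequence, Tuple


def quad_to_rect(quad: Sequence[float], width: int, height: int,
                 pad: int = 2) -> Tuple[int, int, int, int]:
    """Convert one quad box [x1, y1, x2, y2, x3, y3, x4, y4] into a padded,
    image-clamped rectangle (left, top, right, bottom), right/bottom exclusive."""
    xs = quad[0::2]  # every even index is an x coordinate
    ys = quad[1::2]  # every odd index is a y coordinate
    left = max(0, int(min(xs)) - pad)
    top = max(0, int(min(ys)) - pad)
    right = min(width, int(max(xs)) + pad + 1)
    bottom = min(height, int(max(ys)) + pad + 1)
    return left, top, right, bottom


def build_mask(quads: Sequence[Sequence[float]], width: int, height: int,
               pad: int = 2) -> List[List[int]]:
    """Build a binary mask (1 = detected text region, 0 = keep) from quad boxes."""
    mask = [[0] * width for _ in range(height)]
    for quad in quads:
        left, top, right, bottom = quad_to_rect(quad, width, height, pad)
        for y in range(top, bottom):
            for x in range(left, right):
                mask[y][x] = 1
    return mask


# Hypothetical example: one detected text region on a 10x8 image.
example_quads = [[2, 2, 5, 2, 5, 4, 2, 4]]
example_mask = build_mask(example_quads, width=10, height=8, pad=1)
```

The padding is there because OCR boxes tend to hug the glyphs tightly, so a small margin helps cover anti-aliased edges of a watermark.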
2.9.5 is now available, which includes Florence2 captioning. I still think that Gemini is miles ahead of Florence. I have plans to add LLamaSharp in the future to make any .GGUF model available. I'll close this thread now. Please open a new one for any issues related to 2.9.5 or other releases.