stable-diffusion-webui
[Feature Request]: Integrate Blip 2 + Q&A
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do ?
Use Blip 2 in the preprocess tab
Proposed workflow
- Integrate Blip 2 into the Preprocess Images tab under Train.
- The user can ask one or more questions about each image (image[i]).
  - Wrap the user's question with a helper prompt: "Question: {USER_PROMPT} Answer:"
- Future feature: the user can pre-prompt chats.
  - Q1: Where is the OBJECT? Q2: What time of day is it? Q3..
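To make the prompt wrapping concrete, here is a minimal sketch of how the questions could be turned into BLIP-2 style prompts before being sent to the model. The function names `wrap_question` and `build_prompts` are hypothetical and not part of the webui; only the "Question: ... Answer:" template comes from the proposal above.

```python
from typing import List


def wrap_question(user_prompt: str) -> str:
    """Wrap a raw user question in the 'Question: ... Answer:' template.

    Hypothetical helper; the template itself is the one proposed in this issue.
    """
    return f"Question: {user_prompt.strip()} Answer:"


def build_prompts(questions: List[str]) -> List[str]:
    """Build one wrapped prompt per non-empty question, keeping the order."""
    return [wrap_question(q) for q in questions if q.strip()]


if __name__ == "__main__":
    # Each wrapped prompt would be asked against the same image in turn.
    for prompt in build_prompts(["Where is the OBJECT?", "What time of day is it?"]):
        print(prompt)
```

Each resulting string would then be passed, together with the image, to whatever BLIP-2 inference call the preprocess tab ends up using; that part is left out here because it depends on how the model gets integrated.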
Additional information
https://huggingface.co/spaces/Salesforce/BLIP2
The problem might be that BLIP-2 is very large and VRAM-demanding...
BLIP-2 worked for me when using 32 GB or more of CPU RAM. 16 GB was not enough and kept crashing when loading checkpoint shards.