BLIP2 support?
Supporting this ^
Hi @federicotorrielli! Can you say what I have to adjust so that BLIP2 is supported? Just changing
BLIP_MODELS = {
'base': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth',
'large': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth'
}
in clip_interrogator.py isn't enough, I'm afraid?
I will have a look into the code as soon as I have a bit of time. For the sake of simplicity it's best for all of us to re-do the code using LAVIS's implementation of BLIP2.
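For reference, loading BLIP-2 directly through LAVIS looks roughly like this. This is only a minimal sketch of LAVIS's `load_model_and_preprocess` usage; the specific `model_type` checkpoint and the image path are example choices, not anything this repo ships.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load a BLIP-2 captioning model via LAVIS (checkpoint choice is an example;
# other variants such as the flan-t5 models are also available).
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="caption_coco_opt2.7b", is_eval=True, device=device
)

image = Image.open("example.jpg").convert("RGB")          # assumed image path
image_tensor = vis_processors["eval"](image).unsqueeze(0).to(device)

# Generate a caption with BLIP-2
captions = model.generate({"image": image_tensor})
print(captions[0])
```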
The latest 0.6.0 version supports BLIP-2!
On the Config object, set caption_model_name to blip2-2.7b or blip2-flan-t5-xl. Also see run_gradio.py for an example using them.
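A minimal usage sketch (the image path is an assumption, and the clip_model_name shown is just one common choice):

```python
from PIL import Image
from clip_interrogator import Config, Interrogator

# Select a BLIP-2 caption model via the Config object
config = Config(caption_model_name="blip2-2.7b", clip_model_name="ViT-L-14/openai")
ci = Interrogator(config)

image = Image.open("example.jpg").convert("RGB")  # assumed image path
print(ci.interrogate(image))
```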
Please make it possible to save everything in one CSV file, e.g.:
image_name1.png; caption1; caption2; caption3
image_name2.png; caption1; caption2; caption3
image_name3.png; caption1; caption2; caption3
Thanks! In the meantime, a small script like the one below can produce that layout.
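A hedged sketch of such a script: the input folder, output filename, and the choice of using the default, fast, and classic interrogate modes as the three captions are all assumptions.

```python
import csv
from pathlib import Path

from PIL import Image
from clip_interrogator import Config, Interrogator

ci = Interrogator(Config(caption_model_name="blip2-2.7b", clip_model_name="ViT-L-14/openai"))

image_dir = Path("images")  # assumed input folder
with open("captions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter=";")
    for image_path in sorted(image_dir.glob("*.png")):
        image = Image.open(image_path).convert("RGB")
        # Three captions per image: default, fast, and classic interrogation
        writer.writerow([
            image_path.name,
            ci.interrogate(image),
            ci.interrogate_fast(image),
            ci.interrogate_classic(image),
        ])
```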