clip-interrogator BLIP2 support?

Feb 13 '23 15:02 xzuyn

Supporting this ^

Feb 21 '23 12:02 federicotorrielli

Hi @federicotorrielli! Can you tell me what I have to adjust so that BLIP2 is supported? Just changing the entries of

BLIP_MODELS = {
    'base': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth',
    'large': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth'
}

in clip_interrogator.py isn't enough, I'm afraid?

Mar 08 '23 02:03 Marcophono2

I will have a look into the code as soon as I have a bit of time. For the sake of simplicity it's best for all of us to re-do the code using LAVIS's implementation of BLIP2.

Mar 08 '23 07:03 federicotorrielli

The latest 0.6.0 version supports BLIP-2!

On the Config object, set caption_model_name to blip2-2.7b or blip2-flan-t5-xl. Also see run_gradio.py for an example using them.
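
For anyone who wants to try it from a script, here is a minimal sketch of the setting described above. It assumes the usual `Config` / `Interrogator` / `interrogate` usage of clip-interrogator and that Pillow is installed; the helper names `blip2_config_kwargs` and `caption_image` are made up for this example and are not part of the library.

```python
# Caption model names mentioned in this thread for clip-interrogator 0.6.0.
BLIP2_CAPTION_MODELS = ("blip2-2.7b", "blip2-flan-t5-xl")


def blip2_config_kwargs(caption_model_name: str = "blip2-2.7b") -> dict:
    """Return keyword arguments for Config, rejecting names not listed above.

    Hypothetical helper for illustration only.
    """
    if caption_model_name not in BLIP2_CAPTION_MODELS:
        raise ValueError(
            f"expected one of {BLIP2_CAPTION_MODELS}, got {caption_model_name!r}"
        )
    return {"caption_model_name": caption_model_name}


def caption_image(image_path: str, caption_model_name: str = "blip2-2.7b") -> str:
    """Describe one image using a BLIP-2 caption model (hypothetical helper)."""
    # Imported lazily so that the helper above can be used without the
    # heavy dependencies; constructing Interrogator loads the models.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    # Assumption: Config accepts caption_model_name as a keyword argument,
    # as described in the comment above.
    config = Config(**blip2_config_kwargs(caption_model_name))
    ci = Interrogator(config)
    image = Image.open(image_path).convert("RGB")
    return ci.interrogate(image)
```

Calling `caption_image("photo.jpg", "blip2-flan-t5-xl")` would then return the prompt for a single image (the file name is a placeholder). Expect the first call to take a while, since the models have to be loaded.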

Mar 20 '23 04:03 pharmapsychotic

Please make it possible to save everything in one CSV file, one line per image:

image_name1.png; caption1; caption2; caption3
image_name2.png; caption1; caption2; caption3
image_name3.png; caption1; caption2; caption3

Thanks!
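
Until something like this exists in the tool, the layout above can be produced with Python's standard `csv` module. This is only a rough sketch: the function name `write_captions_csv` and the input shape (a mapping from image name to its captions) are assumptions for illustration, not anything clip-interrogator provides.

```python
import csv
from typing import Iterable, Mapping


def write_captions_csv(captions: Mapping[str, Iterable[str]], out_path: str) -> None:
    """Write one semicolon-separated row per image: name, then its captions."""
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=";")
        for image_name, image_captions in captions.items():
            writer.writerow([image_name, *image_captions])
```

For example, `write_captions_csv({"image_name1.png": ["caption1", "caption2", "caption3"]}, "captions.csv")` writes a single row for that image. Captions that themselves contain a semicolon are quoted by the `csv` writer, so the file still parses correctly when read back with the same delimiter.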

Apr 04 '23 02:04 oaefou