
LAVIS - A One-stop Library for Language-Vision Intelligence

282 LAVIS issues

Hi, I used BlipForConditionalGeneration from transformers for image captioning. I want to visualize the rationale behind the generated caption, word by word, in the style of Grad-CAM. I found code from ALBEF (https://github.com/salesforce/ALBEF/blob/main/visualization.ipynb),...
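A hedged sketch of the general Grad-CAM recipe the question is after: cache a cross-attention map on the forward pass, backpropagate a score for one generated token, and weight the attention by its gradient. The `ToyCrossAttention` module below is illustrative only, not BLIP's API; with the real model you would hook the text decoder's cross-attention layers instead.

```python
# Illustrative Grad-CAM-style relevance for one generated token.
# ToyCrossAttention is a stand-in; for BLIP, hook the decoder's
# cross-attention modules (names here are assumptions, not BLIP's).
import torch
import torch.nn as nn

class ToyCrossAttention(nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, text, image):
        # Attention of each text token over the image patches.
        attn = torch.softmax(text @ self.proj(image).transpose(-1, -2), dim=-1)
        self.attn = attn      # cache the map for Grad-CAM
        attn.retain_grad()    # keep gradients on the non-leaf attention tensor
        return attn @ image

model = ToyCrossAttention()
text = torch.randn(1, 3, 8)   # 3 text tokens
image = torch.randn(1, 4, 8)  # 4 image patches
out = model(text, image)

# Scalar score for the "word" of interest (here: norm of token 0's output).
score = out[0, 0].norm()
score.backward()

# Grad-CAM: attention weighted by its gradient, clamped to positive evidence.
cam = (model.attn.grad * model.attn).clamp(min=0)[0, 0]
print(cam.shape)  # one relevance value per image patch
```

Upsampling `cam` to the input image resolution then gives the per-word heatmap, as in the ALBEF notebook.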

Thank you so much for the code! It is pretty useful! Could you please also open source the retrieval training based on BLIP2? Any help is greatly appreciated.

Hello, I appreciate the work you've done. I would like to ask how to interpret the image-text retrieval score. I received a score like this:...
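One common way such scores are read, sketched here with made-up numbers (the actual scale in LAVIS depends on which head produced the score, e.g. raw ITC similarities versus ITM logits): raw similarities are only meaningful relative to each other, so a temperature-scaled softmax turns them into a ranking distribution.

```python
# Illustrative only: converting raw image-text similarity scores into
# probabilities with a temperature-scaled softmax. The values are made up,
# not from LAVIS; the real score scale depends on the model head.
import math

def softmax(scores, temperature=1.0):
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Raw cosine-style similarities between one image and three candidate captions.
scores = [0.42, 0.31, 0.05]
probs = softmax(scores, temperature=0.1)
print([round(p, 3) for p in probs])  # → [0.737, 0.245, 0.018]
```

The absolute score matters less than its rank among candidates: caption 0 wins by a wide margin once the temperature sharpens the distribution.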

In my understanding, VQA is similar to the zero-shot image-to-text generation ability mentioned in the BLIP-2 paper. Both produce an answer conditioned on a prompt (a question or natural-language instruction)...


Thank you very much for your open source contribution, the performance of the model is amazing. If I want to obtain image features from the intermediate layers of the backbone,...
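A minimal sketch of the standard PyTorch answer to this question: register a forward hook on the layer whose output you want. The tiny `nn.Sequential` backbone and the layer name are assumptions for illustration; with a LAVIS model you would locate the target block via `model.named_modules()`.

```python
# Sketch: capturing intermediate features with a forward hook.
# The backbone here is a stand-in for the real vision encoder.
import torch
import torch.nn as nn

backbone = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 8),
)

features = {}

def save_output(name):
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

# Attach to the layer you need; for a real model, pick it out of
# model.named_modules() (the index/name here is illustrative).
handle = backbone[1].register_forward_hook(save_output("relu1"))

x = torch.randn(2, 16)
_ = backbone(x)       # hook fires during the forward pass
handle.remove()       # detach the hook when done

print(features["relu1"].shape)  # torch.Size([2, 32])
```

The hook copies the layer's output into `features` on every forward pass without modifying the model itself.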

Hi, thanks for open-sourcing this, great work. Do you have examples for pre-training BLIP-2 on my own data?

Do you have a training config for blip2 vicuna instruct? Currently, using a VQA dataset with the "blip_question" text processor and a VQA task, I encounter an error at this line...

Are these models supported on an NVIDIA `Quadro RTX 5000`?