BLIP2 VQA finetune

Open evelinehong opened this issue 2 years ago • 4 comments

Hi, we are working on finetuning BLIP2 for VQA. Are there any instructions on how to modify the code? When will the finetuning code be released?

evelinehong avatar Mar 20 '23 20:03 evelinehong

Same question. I am wondering if there are any updates on this. Thanks!

xliucs avatar Apr 06 '23 06:04 xliucs

A similar question was asked here: https://github.com/salesforce/LAVIS/issues/125

fmdmm avatar Apr 14 '23 02:04 fmdmm

Excuse me, I am also working on finetuning BLIP2 for VQA. According to the paper, the prompt used for VQA is "Question: {} Answer:". Is my understanding correct that during training we don't use the prompt and feed only the original question, while at test time we use the prompt to reformat the question for better performance? I would appreciate any help. Thanks.

qwqwq1445 avatar Dec 22 '23 02:12 qwqwq1445

Hi, I have the same question. According to the forward() function in blip2_t5.py, it seems that prompts are not used during training. But shouldn't we use the same format during training and evaluation? Did you figure it out? Thanks!

Hurwitzzz avatar Mar 28 '24 10:03 Hurwitzzz
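
For anyone who wants to try using the same format in both phases, here is a minimal sketch of one way to do it: wrap every question in the template before it reaches the model, for training and evaluation alike. This is not LAVIS code. The helper names `format_questions` and `apply_prompt_to_batch` are hypothetical, and the assumption that a batch is a dict holding its questions under a `"text_input"` key should be checked against the dataset code you actually use. Whether this improves finetuning results is not established in this thread.

```python
# Hypothetical helpers (not part of LAVIS): apply one prompt template to VQA
# questions so that training and evaluation see the same text format.

# Template quoted in this thread from the BLIP2 paper.
VQA_PROMPT = "Question: {} Answer:"


def format_questions(questions, prompt=VQA_PROMPT):
    """Wrap each question in the prompt; an empty prompt leaves them unchanged."""
    if not prompt:
        return list(questions)
    # The question is passed as an argument to format(), so braces inside
    # the question text are not interpreted as placeholders.
    return [prompt.format(q.strip()) for q in questions]


def apply_prompt_to_batch(batch, prompt=VQA_PROMPT):
    """Return a copy of the batch with prompted questions.

    Assumes the batch is a dict whose questions live under "text_input";
    that key name is an assumption, adjust it to your data loader.
    """
    prompted = dict(batch)
    prompted["text_input"] = format_questions(batch["text_input"], prompt)
    return prompted


if __name__ == "__main__":
    batch = {"text_input": ["What color is the car? "], "text_output": ["red"]}
    print(apply_prompt_to_batch(batch)["text_input"])
```

Calling `apply_prompt_to_batch` on each batch before the training step, and using the same template at evaluation, keeps the two formats aligned; passing `prompt=""` restores the raw questions, which makes it easy to compare both settings.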