LAVIS

Finetuning VQA on BLIP2

Open zhl98 opened this issue 1 year ago • 8 comments

Hi, can you add VQA fine-tuning support for BLIP2? According to the paper, the image encoder is also fine-tuned for the VQA task. When I set freeze_vit: False, however, the loss and the model parameters become nan and inf. Initial weight: image

Gradient of the first step of the model: image

Model weights after the update: image

Can you help me analyze the reason? Thank you very much.

zhl98 avatar Jul 07 '23 08:07 zhl98

The lr is 1e-5

zhl98 avatar Jul 07 '23 08:07 zhl98

Hi,

Same problem. Have you fixed it?

simplelifetime avatar Sep 02 '23 04:09 simplelifetime

Update the ViT from fp16 to fp32.
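A minimal sketch of why the precision change may help, using only the Python standard library. The thread does not confirm the root cause, so treat this as one plausible mechanism: when trainable weights and optimizer state are kept in fp16, small updates are rounded away and an Adam-style denominator can underflow to exactly zero, which turns into inf or nan in tensor arithmetic. The helper names, the gradient value, and eps = 1e-8 are assumptions for illustration and are not taken from LAVIS.

```python
import math
import struct


def round_to(fmt: str, x: float) -> float:
    """Round a Python float to the precision of the given struct format."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def fp16(x: float) -> float:
    """Nearest IEEE 754 half-precision value."""
    return round_to("<e", x)


def fp32(x: float) -> float:
    """Nearest IEEE 754 single-precision value."""
    return round_to("<f", x)


def adam_style_denominator(cast, grad: float, eps: float = 1e-8) -> float:
    """sqrt(grad**2) + eps, with every intermediate rounded by `cast`.

    This is a simplified stand-in for the denominator of an Adam-style
    update (no moving average, no bias correction).
    """
    v = cast(cast(grad) ** 2)
    return cast(cast(math.sqrt(v)) + cast(eps))


# 1) An update of size 1e-5 (the learning rate from the thread times an
#    assumed gradient of 1.0) is lost in fp16 but kept in fp32.
weight = 1.0
update = 1e-5
print(fp16(weight - update) == weight)  # the fp16 weight does not move
print(fp32(weight - update) == weight)  # the fp32 weight does move

# 2) With an assumed gradient of 1e-4, both grad**2 and eps underflow
#    to zero in fp16, so the denominator is exactly 0.
print(adam_style_denominator(fp16, 1e-4))
print(adam_style_denominator(fp32, 1e-4))
```

Dividing by a zero denominator yields inf (or nan for 0/0) in floating-point tensor code, which would be consistent with the symptoms described above. Keeping the trainable ViT weights in fp32 avoids both effects in this simplified picture.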

zhl98 avatar Sep 02 '23 04:09 zhl98

@zhl98 Did you manage to fine-tune on VQA? Can you share your code?

BrianG13 avatar Oct 16 '23 15:10 BrianG13

Excuse me, I am also working on fine-tuning BLIP2 on VQA. In the paper, I see that the prompt used for VQA is "Question: {} Answer:". Is my understanding correct that during training we don't use the prompt and feed only the original question, while at test time we use the prompt to reformat the question for better performance? I would appreciate your help. Thanks.
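For reference, applying the template itself is plain string formatting. The sketch below only illustrates the two input variants being asked about; it does not settle which one the authors used during training. The function name and the `use_prompt` flag are hypothetical and are not part of LAVIS.

```python
# Prompt template quoted from the paper in this thread.
VQA_PROMPT = "Question: {} Answer:"


def build_vqa_input(question: str, use_prompt: bool) -> str:
    """Return the text fed to the language model for one VQA sample.

    use_prompt=False gives the raw question; use_prompt=True wraps it
    in the template. Both names are hypothetical helpers.
    """
    question = question.strip()
    return VQA_PROMPT.format(question) if use_prompt else question


print(build_vqa_input("What color is the car?", use_prompt=False))
print(build_vqa_input("What color is the car?", use_prompt=True))
```

A helper like this makes it easy to train and evaluate with the same setting, or to compare both variants empirically.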

qwqwq1445 avatar Dec 22 '23 02:12 qwqwq1445

> Excuse me, I am also working on fine-tuning BLIP2 on VQA. In the paper, I see that the prompt used for VQA is "Question: {} Answer:". Is my understanding correct that during training we don't use the prompt and feed only the original question, while at test time we use the prompt to reformat the question for better performance? I would appreciate your help. Thanks.

Can you share the code?

shams2023 avatar Mar 22 '24 09:03 shams2023

Hi, have you implemented fine-tuning BLIP2 on the VQA task?

WildLight avatar Apr 27 '24 11:04 WildLight

Hey, is there any news about fine-tuning BLIP-2 on the VQA task?

salvatoregrimaUni avatar Jul 09 '24 17:07 salvatoregrimaUni