LAVIS
Finetuning VQA on BLIP2
Hi,
Could you add a VQA fine-tuning function for BLIP-2?
In the paper, fine-tuning on the VQA task also fine-tunes the image encoder, so I set `freeze_vit: False` in the config.
But then I ran into the loss and model parameters becoming NaN and Inf.
Initial weights:
Gradients after the first step:
Weights after the update:
Can you help me analyze the cause? Thank you very much.
The learning rate is 1e-5.
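When the loss goes NaN/Inf like this, a first debugging step is to find out *which* parameters' gradients blew up (in fp16 training it is often the ViT blocks). A minimal stdlib sketch of that check, using toy gradient lists instead of real tensors (the parameter names here are only illustrative; in practice you would loop over `model.named_parameters()` and use `torch.isfinite` on `p.grad`):

```python
import math

def check_finite(named_grads):
    """Return the names of parameters whose gradients contain NaN or Inf."""
    bad = []
    for name, grad in named_grads:
        if any(not math.isfinite(g) for g in grad):
            bad.append(name)
    return bad

# Toy example: simulate one gradient that overflowed in fp16.
grads = [
    ("visual_encoder.blocks.0.attn.qkv.weight", [0.01, float("inf")]),
    ("Qformer.bert.encoder.layer.0.output.dense.weight", [0.002, -0.03]),
]
print(check_finite(grads))  # ['visual_encoder.blocks.0.attn.qkv.weight']
```

If only `visual_encoder.*` parameters show up, that points at the ViT's precision rather than the learning rate.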
Hi,
Same problem here. Have you fixed it?
Update the ViT from fp16 to fp32.
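For reference, in the LAVIS model configs this corresponds to two keys (shown below as a sketch; check your model's YAML under `lavis/configs/models/blip2/` for the exact file and defaults):

```yaml
model:
  # unfreeze the image encoder so it is fine-tuned, as in the paper's VQA setup
  freeze_vit: False
  # keep the ViT in fp32; running it in fp16 with unfrozen weights can overflow to Inf/NaN
  vit_precision: "fp32"
```

With `freeze_vit: True` the fp16 ViT is only used for inference, which is why the NaNs only appear once you unfreeze it.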
@zhl98 Did you manage to fine-tune on VQA? Can you share code?
Excuse me, I am also working on fine-tuning BLIP-2 on VQA. In the paper, I find that the prompt used for VQA is "Question: {} Answer:". Is my understanding correct that during training we do not use the prompt and only feed the original question, while at test time we use the prompt to reformat the question for better performance? I would appreciate your help. Thanks.
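For concreteness, the prompt from the paper is just a format string wrapped around the raw question. A small sketch of what that reformatting looks like (the function name and the `use_prompt` switch are my own, not LAVIS APIs):

```python
# Prompt template reported in the BLIP-2 paper for VQA.
VQA_PROMPT = "Question: {} Answer:"

def format_vqa_input(question: str, use_prompt: bool = True) -> str:
    """Optionally wrap a raw question in the paper's VQA prompt template."""
    return VQA_PROMPT.format(question) if use_prompt else question

print(format_vqa_input("What color is the car?"))
# Question: What color is the car? Answer:
print(format_vqa_input("What color is the car?", use_prompt=False))
# What color is the car?
```

Whether the prompt is applied during fine-tuning as well is exactly the open question in this thread, so treat the `use_prompt` flag as the thing to experiment with.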
Can you share code?
Hi, have you implemented fine-tuning BLIP-2 on the VQA task?
Hey, is there any news about fine-tuning BLIP-2 on the VQA task?