
How to use your own dataset to train and fine-tune BLIP2-flant5xl on the VQA task

Open xcxhy opened this issue 2 years ago • 24 comments

Hi, thank you very much for open-sourcing this. I want to use my own images, captions, and QA data to fine-tune BLIP2. Should my process be to prepare a dataset in the same format as OK-VQA, and then run the /run_scripts/blip2/eval/eval_okvqa_zeroshot_flant5xl.sh file? Should I then copy evaluate.py into the run_scripts/blip2/eval/ path? Or is my approach wrong?
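
For reference, this is the kind of annotation file I have in mind, loosely following the OK-VQA layout (the exact field names are my assumption, not confirmed against the repo):

```python
import json

# Hypothetical annotation file: one record per question,
# answers kept as a list so multiple references are possible.
annotations = [
    {
        "image": "images/0001.jpg",          # path relative to the dataset root
        "question": "What color is the car?",
        "question_id": 1,
        "answer": ["red", "dark red"],
    },
]

with open("my_vqa_train.json", "w") as f:
    json.dump(annotations, f, indent=2)
```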

xcxhy avatar Feb 23 '23 15:02 xcxhy

Hi, eval_okvqa_zeroshot_flant5xl.sh provides the script for evaluation. You can refer to train_caption_coco.sh for fine-tuning on image captioning. We are still working on providing support for VQA fine-tuning.
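
In the meantime, zero-shot VQA works by prompting the released checkpoint. A minimal sketch (the registry names follow the LAVIS model zoo; example.jpg is a placeholder):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# "blip2_t5" / "pretrain_flant5xl" are the registry names for BLIP2-FlanT5-XL.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xl", is_eval=True, device=device
)

image = vis_processors["eval"](
    Image.open("example.jpg").convert("RGB")
).unsqueeze(0).to(device)

# Prompt-style VQA, the same pattern the zero-shot eval config relies on.
print(model.generate({"image": image, "prompt": "Question: what is in the photo? Answer:"}))
```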

Thanks.

LiJunnan1992 avatar Feb 27 '23 07:02 LiJunnan1992

Thank you very much for your comments. However, it appears that the codes are designed for fine-tuning on the COCO dataset rather than a custom dataset. I was wondering if it would be possible to make modifications to the code in order to fine-tune the model on our custom dataset by registering it in the 'builders' directory?

chenyd0763 avatar Mar 04 '23 23:03 chenyd0763

@chenyd0763, can you take a look at our tutorial on how to add new datasets? https://opensource.salesforce.com/LAVIS//latest/tutorial.datasets.html
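
Condensed from that tutorial, a custom dataset comes down to registering a builder. A rough sketch, where "my_vqa" and the yaml path are placeholders, and reusing the stock VQA dataset classes assumes your annotations follow the same json layout:

```python
from lavis.common.registry import registry
from lavis.datasets.builders.base_dataset_builder import BaseDatasetBuilder
from lavis.datasets.datasets.vqa_datasets import VQADataset, VQAEvalDataset

@registry.register_builder("my_vqa")
class MyVQABuilder(BaseDatasetBuilder):
    # Reuse the stock VQA dataset classes if your annotation layout matches theirs.
    train_dataset_cls = VQADataset
    eval_dataset_cls = VQAEvalDataset

    # The yaml lists annotation and image paths, mirroring the existing
    # dataset configs such as the OK-VQA defaults.
    DATASET_CONFIG_DICT = {"default": "configs/datasets/my_vqa/defaults.yaml"}
```

The builder also has to be imported in lavis/datasets/builders/__init__.py so the registry picks it up; after that the dataset can be referenced by name from a training config.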

dxli94 avatar Mar 06 '23 14:03 dxli94

Looking forward to the training and fine-tuning code.

dongrixinyu avatar Mar 14 '23 06:03 dongrixinyu

Thanks for your response, I will try it later.

xcxhy avatar Mar 14 '23 06:03 xcxhy

Also looking forward to the training and fine-tuning code.

xliucs avatar Apr 06 '23 06:04 xliucs

Is fine-tuning code available?

mayada24 avatar Apr 06 '23 23:04 mayada24

Looking forward to the training and fine-tuning code.

dreamlychina avatar Apr 19 '23 09:04 dreamlychina

Looking forward to the training and fine-tuning code.

matthewdm0816 avatar May 07 '23 23:05 matthewdm0816

Looking forward to the finetuning code for VQA, think it could lead to some very interesting applications :)

AbhinavGopal avatar May 10 '23 17:05 AbhinavGopal

Looking forward to the fine-tuning code for VQA as well.

arcb01 avatar May 12 '23 11:05 arcb01

Looking forward to the fine-tuning code for VQA +1

edchengg avatar May 31 '23 18:05 edchengg

Also looking forward to the fine-tuning support. Is it here yet? :)

robertjoellewis avatar Jul 20 '23 04:07 robertjoellewis

Also looking forward to the fine-tuning support!

essamsleiman avatar Jul 21 '23 22:07 essamsleiman

Also looking forward to the fine-tuning code on VQA!

qwqwq1445 avatar Aug 08 '23 03:08 qwqwq1445

Is the VQA fine-tuning code still not out?

nkjulia avatar Aug 09 '23 02:08 nkjulia

Also looking forward to the fine-tuning code for VQA!

NWalker4483 avatar Aug 11 '23 17:08 NWalker4483

Looking forward to fine-tuning for VQA!

weizhouc avatar Aug 18 '23 23:08 weizhouc

Looking forward to fine-tuning for VQA. At this point I'm just captioning and then running an LLM of choice, but it would obviously be awesome if VQA could be fine-tuned directly.
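
Roughly what I mean, as a sketch (the registry names follow the LAVIS model zoo; the final LLM call is left abstract since it depends on what you run):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Step 1: caption the image with a BLIP-2 captioning checkpoint.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="caption_coco_opt2.7b", is_eval=True, device=device
)
image = vis_processors["eval"](
    Image.open("photo.jpg").convert("RGB")
).unsqueeze(0).to(device)
caption = model.generate({"image": image})[0]

# Step 2: hand the caption plus the question to an LLM of choice.
prompt = f"Image description: {caption}\nQuestion: what is the person doing?\nAnswer:"
# ...send `prompt` to whatever LLM you have on hand (API or local).
```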

lookevink avatar Sep 13 '23 01:09 lookevink

Also looking forward to the fine-tuning code for VQA :)

wendyunji avatar Sep 24 '23 07:09 wendyunji

Does anybody know if code for BLIP2 VQA fine-tuning is available? Thanks!

hannahgym avatar Nov 26 '23 02:11 hannahgym

As far as I know, no, obviously.

18445864529 avatar Nov 28 '23 15:11 18445864529

Hi everyone. I have implemented the BLIP-VQA-BASE model for the VQA task here. I hope this implementation helps, and I would welcome any feedback on it.
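
The core of it is a standard Hugging Face training step. A condensed sketch, not necessarily identical to the linked code (the image path and the toy question/answer pair are placeholders):

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One toy (image, question, answer) triple; in practice, loop over a DataLoader.
image = Image.open("sample.jpg").convert("RGB")
inputs = processor(images=image, text="what color is the car?", return_tensors="pt")
labels = processor(text="red", return_tensors="pt").input_ids

# The tokenized answer serves as decoder labels, so the forward pass returns a loss.
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```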

dino-chiio avatar Dec 05 '23 06:12 dino-chiio

Hi, have you managed to fine-tune BLIP2 on the VQA task?

WildLight avatar Apr 27 '24 11:04 WildLight