How to train OFA for open-ended VQA?
Dear authors: Thanks for the great work! During VQA validation, I would like the model to predict the most likely next token (i.e. generate one token of the answer) from the output logits, append that token to the input, and repeat until the model predicts ⟨EOS⟩. What should I do to achieve this? Thanks a lot!
I would also like to both train and validate in this manner. Thanks for your precious time!
Hi, currently the VQA task code supports beam-search inference during validation and testing (in contrast with all-candidate inference; please refer to the readme), but the finetuning objective must still be constrained to a pre-defined candidate answer set stored in the trainval_ans2label.pkl file. We are working on a new config to support unconstrained finetuning (which does not require a pre-defined candidate answer set). The code update is still under testing and will be merged this week.
@yangapku Hi, any updates on this? Thanks!
Hi, a pull request related to this issue, #124, was proposed recently, which adds a new config to activate unconstrained finetuning. However, we have found that bugs still exist in this PR which result in a zero score during evaluation. We are working on fixing them and will merge the PR ASAP.
Hi, thanks for your great work!
Any update on this?
@qyc-98 @RishabhMaheshwary @ilovecv Hi, we have found the bug and fixed it! The latest codebase now supports open-ended (unconstrained) VQA finetuning and evaluation. Please pull the latest code and refer to PR #124 and run_scripts/vqa/train_vqa_distributed.sh (lines 62-68) for how to activate it!
Hi, are there any performance data for the open-ended VQA fine-tuning?
@leng-yue We have tested open-ended VQA fine-tuning on OFA-base (without using EMA). It achieves a 76.4 score on our VQA validation set. This can still be improved by using EMA and further hyper-parameter tuning.
Thanks for your response, the result looks good :)