Oscar
VQA object tags are different from image feature
Hi, I am currently working on the VQA datasets.
The VQA fine-tuning script for Oscar-base in VinVL_MODEL_ZOO.md uses --data_label_type mask, so it reads the text data from train2014_qla_mrcnn.json, downloaded from https://biglmdiag.blob.core.windows.net/vinvl/datasets/vqa
I found that the object tags in train2014_qla_mrcnn.json differ from those in the prediction.tsv downloaded from the pre-extracted COCO 2014 Train/Val Image Features (~50G), although the img_features lengths are the same.
Because the script uses --img_feature_type faster_r-cnn and --data_label_type mask, I guess the input object tags (text) come from mask and the image features come from faster_r-cnn.
Can you explain the design choice? Do you have experiment results for --img_feature_type faster_r-cnn and --data_label_type faster?
Thanks!
Excuse me, may I ask whether you have the files train+val2014_qla_mrcnn.json, test2015_qla_mrcnn.json and test-dev2015_qla_mrcnn.json? I found that these files are missing, which makes inference and official evaluation difficult.
No, they are not provided in DOWNLOAD. I think we have to create them ourselves somehow.
In this closed issue (#13), I noticed that the author mentioned how to generate the mask-rcnn-based object labels. I tried to reproduce the labels on the VQA training images. My generated labels are similar to the released image labels but still differ in places. I'm not sure whether these generated labels can reproduce the same VQA scores.
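One way to put a number on "similar but still with some differences" is a per-image tag overlap between the released labels and the regenerated ones. The snippet below is only a minimal sketch: it assumes both label sets have already been loaded into dicts mapping an image id to a tag string (how to load them depends on the file layout, which is not verified here), and the function names parse_tags, tag_overlap and compare_labels are hypothetical, not part of Oscar.

```python
from collections import Counter


def parse_tags(tag_string):
    """Turn a tag string into a multiset of words.

    Assumption: tags are separated by ';' or whitespace. Multi-word
    labels (e.g. 'traffic light') are split into single words.
    """
    return Counter(tag_string.replace(";", " ").split())


def tag_overlap(tags_a, tags_b):
    """Multiset Jaccard similarity between two tag strings (0.0 to 1.0)."""
    a, b = parse_tags(tags_a), parse_tags(tags_b)
    union = sum((a | b).values())
    if union == 0:
        # Both tag strings are empty: treat them as identical.
        return 1.0
    return sum((a & b).values()) / union


def compare_labels(released, generated):
    """Per-image overlap for the image ids present in both dicts."""
    shared_ids = released.keys() & generated.keys()
    return {i: tag_overlap(released[i], generated[i]) for i in shared_ids}


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the real files.
    released = {"img_1": "dog;dog;frisbee", "img_2": "cat"}
    generated = {"img_1": "dog frisbee grass", "img_3": "car"}
    print(compare_labels(released, generated))
```

A low average overlap would suggest the regenerated labels come from a different detector or vocabulary than the released ones. A high overlap still does not guarantee the same VQA scores.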
I have exactly the same question. I am confused about which image features are used for VQA fine-tuning: predictions.tsv (VinVL features), image_feature_type (faster_r-cnn), or data_label_type (mask r-cnn? https://github.com/microsoft/Oscar/issues/13#issuecomment-645809973_)? Have you figured it out? Many thanks!
Same question