HieCoAttenVQA

Error running prepro_vqa.py when split=2

xhzhao opened this issue 8 years ago · 16 comments

I got this error when split=2, while split=1 works fine. The commands are:

```
python vqa_preprocess.py --download 1 --split 2
python prepro_vqa.py --input_train_json ../data/vqa_raw_train.json --input_test_json ../data/vqa_raw_test.json --num_ans 1000
```

The error is:

```
top words and their counts:
(320161, '?') (225976, 'the') (200545, 'is') (118203, 'what') (76624, 'are')
(64512, 'this') (49209, 'in') (45681, 'a') (41629, 'on') (40158, 'how')
(38230, 'many') (37322, 'color') (37023, 'of') (29182, 'there') (18392, 'man')
(14668, 'does') (13492, 'people') (12518, 'picture') (11779, "'s") (11758, 'to')
total words: 2284620
number of bad words: 0/14770 = 0.00%
number of words in vocab would be 14770
number of UNKs: 0/2284620 = 0.00%
inserting the special UNK token
Traceback (most recent call last):
  File "prepro_vqa.py", line 292, in <module>
    main(params)
  File "prepro_vqa.py", line 217, in main
    ans_test = encode_answer(imgs_test, atoi)
  File "prepro_vqa.py", line 128, in encode_answer
    ans_arrays[i] = atoi.get(img['ans'], -1) # -1 means wrong answer.
KeyError: 'ans'
```

xhzhao avatar Dec 19 '16 05:12 xhzhao

There is no 'ans' key in split 2; you should modify the lines that access it.
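
For illustration, a minimal sketch of one such guard, assuming encode_answer looks roughly like the frame shown in the traceback above (the numpy array and the -1 fallback are assumptions for this sketch, not the repo's actual code):

```python
import numpy as np

def encode_answer(imgs, atoi):
    # Encode each answer as its vocabulary index; test entries (split=2)
    # carry no ground-truth 'ans' key, so fall back to -1 for them.
    ans_arrays = np.zeros(len(imgs), dtype=np.int64)
    for i, img in enumerate(imgs):
        if 'ans' in img:
            ans_arrays[i] = atoi.get(img['ans'], -1)  # -1 means wrong answer
        else:
            ans_arrays[i] = -1  # no answer available for the test split
    return ans_arrays
```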

idansc avatar Dec 31 '16 14:12 idansc

@idansc Thank you, I fixed this code bug and downloaded the pretrained model from here (https://filebox.ece.vt.edu/~jiasenlu/codeRelease/co_atten/model/vqa_model/model_alternating_train-val_vgg.t7). I submitted the results as vqa_OpenEnded_mscoco_test-dev2015_HieCoAttenVQA_results.json and got the following accuracy:

```
{"overall": 43.11, "perAnswerType": {"other": 15.74, "number": 29.76, "yes/no": 78.73}}
```

But the overall accuracy reported in the paper is 60.1%; I really don't know where this gap comes from.

xhzhao avatar Jan 10 '17 08:01 xhzhao

Did they provide the json and h5 files as well? The model needs to be aligned with the preprocessed files.

idansc avatar Jan 10 '17 08:01 idansc

Yeah, the json is provided, and I generated the h5 file myself:

```
th prepro_img_vgg.lua -input_json ../data/vqa_data_prepro.json -image_root /home/jiasenlu/data/ -cnn_proto ../image_model/VGG_ILSVRC_19_layers_deploy.prototxt -cnn_model ../image_model/VGG_ILSVRC_19_layers.caffemodel
```

xhzhao avatar Jan 10 '17 08:01 xhzhao

That's fine for the images, but what about the h5 file containing the preprocessed question dataset? If that one isn't provided, it will cause problems in training (the model has to fit the right answers to the right questions). I believe the gap is caused by some sort of misalignment in the preprocessing.
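
To see why, here is a toy illustration (the vocabularies below are hypothetical, not the repo's): the same question encoded under two different word-to-index mappings yields different id sequences, so a model trained against one vocabulary misreads data preprocessed with another:

```python
# Two vocabularies built from the same words but in a different order.
vocab_a = {'what': 1, 'color': 2, 'is': 3}
vocab_b = {'is': 1, 'what': 2, 'color': 3}

question = ['what', 'color', 'is']
print([vocab_a[w] for w in question])  # [1, 2, 3]
print([vocab_b[w] for w in question])  # [2, 3, 1] -- same words, different ids
```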

idansc avatar Jan 10 '17 20:01 idansc

@idansc any idea how to fix this misalignment problem?

xhzhao avatar Jan 12 '17 07:01 xhzhao

Did you try training the model yourself? By the way, are you running on CPU or GPU?

idansc avatar Jan 15 '17 08:01 idansc

I have tried training this model on a GPU (an M40), but the training process is very slow (12.5 hours per epoch, and the paper uses 250 epochs). I'm trying to find out where the bottleneck is.

xhzhao avatar Jan 16 '17 00:01 xhzhao

An epoch should take a few minutes on an M40. Check that your stored CNN features are on an SSD, or use the DataLoader if you have enough RAM (about 60 GB).
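
One quick way to test whether disk I/O is the bottleneck is to time random reads from the feature file; a sketch, with the h5 path and dataset name assumed for illustration rather than taken from the repo:

```python
import time
import h5py
import numpy as np

# Time 100 random-row reads from the image-feature h5 file.
# The path and dataset name ('images_train') are assumptions.
with h5py.File('../data/vqa_data_img_vgg_train.h5', 'r') as f:
    feats = f['images_train']
    # h5py fancy indexing requires indices in increasing order.
    idx = np.sort(np.random.choice(feats.shape[0], 100, replace=False))
    t0 = time.time()
    _ = feats[idx.tolist()]
    print('100 random reads took %.2fs' % (time.time() - t0))
```

If this takes seconds rather than milliseconds, the features are likely sitting on a slow disk.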

idansc avatar Jan 16 '17 00:01 idansc

Yes, it should be. I trained another model based on another GitHub repo and the speed was very fast, while the accuracy was not good enough: here. I will double-check the hardware.

xhzhao avatar Jan 16 '17 00:01 xhzhao

@xhzhao, @idansc, sorry for being off topic, but I really need some help here. I trained the model on a customized VQA dataset, but I am not sure how to run the evaluation now. I read the README, but it isn't clear from there. I would highly appreciate any help, and if you are open to discussion, I guess we can continue here.

yauhen-info avatar Jan 25 '17 22:01 yauhen-info

@xhzhao I ran into the same problem, but I have no idea how to deal with it. How did you solve it?

lupantech avatar Mar 13 '17 18:03 lupantech

OK, I got it working :)

lupantech avatar Mar 14 '17 07:03 lupantech

@lupantech @yauhen-info @xhzhao How did you solve the problem? I am using the VQA v1.9 dataset, the eval.lua provided by HieCoAttenVQA, and the vqaEvalDemo.py provided by VT-vision-lab/VQA, and it reports an error in vqa.py: 'Results do not correspond to current VQA set. Either the results do not have predictions for all question ids in annotation file or there is at least one question id that does not belong to the question ids in the annotation file.' I saw one suggestion to use the eval.lua in VT-vision-lab/VQA_LSTM_CNN, but it is not well suited to the one in HieCoAttenVQA. Thanks!

panfengli avatar Apr 13 '17 04:04 panfengli

I found that the error is due to the new question ids in VQA dataset v1.9 exceeding the precision of a FloatTensor, which causes a mismatch when copying from data.ques_id. Changing one line of code fixes it:

```
local ques_id = torch.DoubleTensor(total_num)
```
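
For context, a small illustration of the precision issue (the question id below is a made-up value at v1.9 scale): a 32-bit float has a 24-bit significand, so integers above 2**24 are not all exactly representable, and nearby ids collapse to the same value:

```python
import numpy as np

qid = 581929003  # hypothetical v1.9-scale question id (well above 2**24)
print(np.float32(qid) == np.float32(qid + 1))  # True: the two ids collide
print(np.float64(qid) == np.float64(qid + 1))  # False: doubles keep them apart
```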

panfengli avatar Apr 13 '17 19:04 panfengli

@panfengli which file is the line `ques_id = torch.DoubleTensor(total_num)` in?

woshiacai avatar Oct 06 '17 03:10 woshiacai