lxmert
Hyperparameters for the VizWiz dataset
Hi, I read about the VizWiz leaderboard for ECCV 2018. The reported result is 55.40 with no model ensemble, but when I trained on the VizWiz dataset I only reached 51.96, so I would like to understand why the results differ. My answer vocabulary for VizWiz is the 3000 most common answers, the initial learning rate is 5e-5, the number of epochs is 4, and the batch size is 32. The pretrained model I used is Epoch20_LXRT.pth. If convenient, could you share your hyperparameters for the VizWiz dataset?
Could you try the configuration I used to submit the leaderboard entry?
- Batch size: 64
- LR: 1e-4
- Epochs: 20 (VizWiz is very small: one epoch takes around 10 minutes while a VQA epoch takes 1.5 hours, so we increased the number of epochs.)
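The settings above can be plugged into a fine-tuning command. This is a hypothetical invocation modeled on the style of the repo's VQA fine-tuning script; the script path, data split names, and flag names are assumptions borrowed from the VQA task and may differ for a VizWiz setup, but the hyperparameter values are the ones listed above.

```shell
# Hypothetical VizWiz fine-tuning run in the style of run/vqa_finetune.bash;
# only --batchSize, --lr, and --epochs values come from this thread.
python src/tasks/vqa.py \
    --train train --valid valid \
    --loadLXMERT snap/pretrained/model \
    --batchSize 64 --lr 1e-4 --epochs 20 \
    --tqdm --output snap/vizwiz/finetune
```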
OK! I will try it soon! Thanks a lot! But I still have two questions about the training. Looking forward to your reply.
- How do you deal with the answer labels? Every question has 10 answers, but there are no per-answer scores like in VQA. So how do you build the answer labels?
- The loss function. I chose the soft loss function used in https://github.com/DenisDsh/VizWiz-VQA-PyTorch/blob/master/train.py, but I do not know which loss function you chose. Still cross-entropy?
Thanks. I have uploaded the materials here: http://nlp.cs.unc.edu/data/lxmert_data/vizwiz/vizwiz.zip. Please take a look.
For the loss function, I just used cross-entropy, the same as for VQA/GQA.
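For reference, the "cross-entropy as VQA" loss in this family of models is usually a binary cross-entropy over the answer vocabulary with soft (per-answer score) targets rather than a single-class softmax. A minimal pure-Python sketch of that loss follows; whether the author's VizWiz run used exactly this form or a plain one-hot cross-entropy is an assumption here.

```python
import math

def soft_bce(logits, targets):
    """Binary cross-entropy with soft targets, summed over the answer
    vocabulary. A sketch of the soft-target loss commonly used for VQA
    fine-tuning (e.g. torch.nn.BCEWithLogitsLoss); toy inputs only.
    """
    total = 0.0
    for z, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability per answer
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total
```

With a confident correct prediction (large positive logit on the target answer, large negative elsewhere), the loss is close to zero, as expected.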
Sorry to trouble you again. When I use the materials above, I get a KeyError:

```
target[self.raw_dataset.ans2label[ans]] = score
KeyError: '1 package stouffer signature classics fettuccini alfredo'
```

I could not find a solution, because I expected the key to be in the dict. Could you help me figure this out?
I think I just removed the answer if it was not in the dict.
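The fix described above can be sketched as follows: when building the target vector, skip any ground-truth answer that falls outside the top-3000 vocabulary instead of letting the `ans2label` lookup raise a KeyError. The names `ans2label`, `target`, and `score` mirror the traceback; the toy vocabulary is an assumption.

```python
# Toy 3-answer vocabulary standing in for the real top-3000 ans2label dict.
ans2label = {"yes": 0, "no": 1, "unanswerable": 2}
num_answers = len(ans2label)

def build_target(answers_with_scores):
    """Build a soft-score target vector, dropping out-of-vocabulary
    answers instead of raising KeyError on rare long-tail strings."""
    target = [0.0] * num_answers
    for ans, score in answers_with_scores:
        if ans not in ans2label:  # rare answer not in the vocabulary: skip
            continue
        target[ans2label[ans]] = score
    return target
```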
OK! I found it! Thanks a lot!!
I checked the test file and found that the test files have been changed. I wanted to use your Docker image, but the pretrained model link below is out of date: https://www.dropbox.com/s/nu6jwhc88ujbw1v/resnet101_faster_rcnn_final_iter_320000.caffemodel?dl=1
So could you use your model to generate the new test data? Thanks a lot!
The new Dropbox link for the model has been updated in the bottom-up-attention repo and is available [here](alternative pretrained model).
OK! Thanks a lot!! I wonder how you convert the answers into labels, especially how you add the label confidence.
This part is almost the same as the VQA pre-processing. You could read this repo for details.
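Since the thread says the preprocessing follows VQA, the label confidence is presumably the standard VQA soft score: an answer given by k of the 10 annotators receives score min(1, k/3). This is inferred from "same as VQA", not stated explicitly for VizWiz, so treat it as an assumption.

```python
from collections import Counter

def answer_scores(answers):
    """VQA-style label confidence from the 10 human answers: an answer
    given by k annotators gets score min(1, k / 3). (The official VQA
    metric averages over annotator subsets; this is the common
    preprocessing approximation.)"""
    counts = Counter(answers)
    return {ans: min(1.0, c / 3.0) for ans, c in counts.items()}
```

For example, 8 annotators answering "yes" and 2 answering "no" yields scores of 1.0 for "yes" and about 0.67 for "no".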