
Hyperparameters for the VizWiz dataset

Open runzeer opened this issue 4 years ago • 10 comments

Dear Prof.: I read the VizWiz leaderboard for ECCV 2018. The reported result is 55.40 without model ensembling, but when I trained on the VizWiz dataset I only reached 51.96, so I would like to know why the results differ. My answer vocabulary for VizWiz is the 3000 most common answers, the initial learning rate is 5e-5, the number of epochs is 4, and the batch size is 32. The pretrained model I used is Epoch20_LXRT.pth. If convenient, could you share your hyperparameters for the VizWiz dataset?

runzeer avatar May 12 '20 09:05 runzeer

Could you try the configuration I used for the leaderboard submission?

BatchSize 64,
LR 1e-4,
Epochs 20 (VizWiz is super small:
        one epoch takes around 10 minutes while VQA takes 1.5 hours,
        so we increased the number of epochs)

airsplay avatar May 12 '20 15:05 airsplay

OK! I will try it soon! Thanks a lot! But I still have two questions about training. Looking forward to your reply.

  1. How do you deal with the answer labels? Every question has 10 answers, but there is no per-answer score like in VQA. So how do you assign the answer labels?
  2. The loss function. I chose the soft loss used in https://github.com/DenisDsh/VizWiz-VQA-PyTorch/blob/master/train.py , but I do not know which loss you chose. Still cross-entropy?
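For reference, the "soft loss" setup mentioned above is usually built from soft answer targets. Here is a minimal sketch, assuming the common VQA accuracy rule score = min(count / 3, 1) over the 10 annotator answers; the function name and the tiny vocabulary are hypothetical, not taken from this repo:

```python
from collections import Counter

def soft_target(answers, ans2label, num_labels):
    """Build a soft target vector from the 10 annotator answers using
    the common VQA rule score = min(count / 3, 1). Answers missing
    from the vocabulary are skipped. (Illustrative sketch only.)"""
    target = [0.0] * num_labels
    for ans, count in Counter(answers).items():
        if ans in ans2label:
            target[ans2label[ans]] = min(count / 3.0, 1.0)
    return target

# Hypothetical example with a tiny 3-answer vocabulary.
ans2label = {"yes": 0, "no": 1, "unanswerable": 2}
answers = ["yes"] * 6 + ["no"] * 3 + ["unanswerable"]
target = soft_target(answers, ans2label, num_labels=3)
# target == [1.0, 1.0, 0.333...]
```

With soft targets like this, a binary cross-entropy loss over logits is the usual pairing; with single hard labels, plain cross-entropy is used instead.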

runzeer avatar May 13 '20 00:05 runzeer

Thanks. I have uploaded the materials here: http://nlp.cs.unc.edu/data/lxmert_data/vizwiz/vizwiz.zip. Please take a look.

For the loss function, I just used cross-entropy, as in VQA/GQA.

airsplay avatar May 13 '20 01:05 airsplay

Sorry to trouble you again. When I use the materials above, I get a KeyError at `target[self.raw_dataset.ans2label[ans]] = score`: `KeyError: '1 package stouffer signature classics fettuccini alfredo'`. I cannot find the cause, because the key appears to be in the dict. Could you help me with this?

runzeer avatar May 13 '20 02:05 runzeer

I think I just remove the answer if it is not in the dict.
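That fix can be sketched as follows. This is a hypothetical reconstruction based on the traceback above (the names `ans2label`, `target`, and `score` come from it; `build_target` is an invented helper), not the repo's actual code:

```python
def build_target(labels, ans2label, num_answers):
    """Fill the target vector, silently dropping any answer that is
    missing from the answer vocabulary -- this avoids the KeyError
    on rare free-form VizWiz answers. `labels` maps answer -> score."""
    target = [0.0] * num_answers
    for ans, score in labels.items():
        if ans not in ans2label:   # out-of-vocabulary answer: skip it
            continue
        target[ans2label[ans]] = score
    return target

ans2label = {"yes": 0, "no": 1}    # hypothetical tiny vocabulary
labels = {"yes": 1.0, "1 package stouffer signature classics fettuccini alfredo": 0.3}
target = build_target(labels, ans2label, 2)
# target == [1.0, 0.0]
```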

airsplay avatar May 13 '20 02:05 airsplay

OK! I found it! Thanks a lot!!

runzeer avatar May 13 '20 02:05 runzeer

I checked the test file and found that the test files have changed. I wanted to use your Docker image, but the pretrained model link below is out of date: https://www.dropbox.com/s/nu6jwhc88ujbw1v/resnet101_faster_rcnn_final_iter_320000.caffemodel?dl=1

So could you use your model to generate the new test data? Thanks a lot!

runzeer avatar May 13 '20 08:05 runzeer

The new Dropbox link for the model has been updated on the bottom-up-attention repo (listed there as an alternative pretrained model).

airsplay avatar May 13 '20 14:05 airsplay

OK! Thanks a lot!! I am wondering how you convert the answers to labels, especially how you add the label confidence.

runzeer avatar May 14 '20 02:05 runzeer

This part is almost the same as the standard VQA pre-processing. You can read this repo for details.
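One common VQA pre-processing convention maps the number of annotators who gave an answer to a confidence via the table {0: 0.0, 1: 0.3, 2: 0.6, 3: 0.9, ≥4: 1.0}. The sketch below assumes that convention; the function names are invented and this may not be the exact rule used here:

```python
from collections import Counter

def occurrence_to_confidence(n):
    """Map how many of the 10 annotators gave an answer to a label
    confidence, following a common VQA soft-score table (assumption)."""
    return {0: 0.0, 1: 0.3, 2: 0.6, 3: 0.9}.get(n, 1.0)

def answers_to_labels(answers, ans2label):
    """Turn the raw 10-answer list into {label_id: confidence},
    skipping answers outside the vocabulary."""
    labels = {}
    for ans, count in Counter(answers).items():
        if ans in ans2label:
            labels[ans2label[ans]] = occurrence_to_confidence(count)
    return labels

ans2label = {"yes": 0, "no": 1}      # hypothetical tiny vocabulary
print(answers_to_labels(["yes"] * 7 + ["no"] * 3, ans2label))
# {0: 1.0, 1: 0.9}
```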

airsplay avatar May 14 '20 03:05 airsplay