Junbin Xiao
Hi, thanks for the interest. We focus on the multiple-choice setting in NExT-QA, so there is no global ans_words in the nextqa folder.
You can set multi_choice = True in main_qa.py to avoid possible issues.
Hi, please use train.trainval_gdqa by default for one-stage training.
Hi, thanks for the interest. I have uploaded the related code (for reference only). To extract region features, you need to sample frames in the same way and use the...
Basically, you can follow a coarse pipeline: extract_video.py (decode mp4 videos into frames) -> preprocess_feature.py (sample and encode frames into CNN representations) -> split_dataset_feat.py (split the features into train/val/test).
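The pipeline above can be sketched as a short driver script. The three script names come from the reply; any command-line flags they take are not shown here, so this sketch only echoes the invocations in order rather than assuming arguments:

```shell
# Coarse pipeline sketch: decode -> encode -> split, run in this order.
# Script names are from the repo; flags are intentionally omitted.
STEPS="extract_video.py preprocess_feature.py split_dataset_feat.py"
for step in $STEPS; do
  echo "python $step"
done
```

Run each stage to completion before starting the next, since every stage consumes the previous stage's output.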
Please choose resnet-101 with d2.
Please consider changing the hyper-parameter for the object number in the .sh file, with bnum=5 or 10 depending on the specific dataset.
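A minimal sketch of that edit, assuming the training .sh script reads a plain shell variable named `bnum` (the variable name is from the reply; its placement in the script is an assumption):

```shell
# Object-number hyper-parameter in the training .sh script.
# Use 5 or 10 depending on the dataset (hedged: exact best value varies).
bnum=5
echo "training with bnum=$bnum"
```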
Hi, please find the feature for STAR [here](https://drive.google.com/file/d/1a-wZ5S6Xk7g765E7Ny8a6MRDfDpAj1bY/view?usp=sharing).
Hi, we use ffmpeg and decode each video (or the QA-related segment for STAR) at 3 fps.
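A hedged sketch of such a decode command, using ffmpeg's standard `fps` video filter. The input path, output directory, and frame-naming pattern are hypothetical; only the 3 fps rate comes from the reply. The command is echoed rather than executed so the sketch does not require an actual video file:

```shell
# Decode a video into JPEG frames at 3 fps (hypothetical paths).
FPS=3
VIDEO="videos/example.mp4"     # assumed input path
OUTDIR="frames/example"        # assumed output directory
CMD="ffmpeg -i $VIDEO -vf fps=$FPS $OUTDIR/%06d.jpg"
echo "$CMD"
```

For STAR, you would first trim the QA-related segment (e.g. with ffmpeg's `-ss`/`-to` options) and then decode that segment the same way.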
It should be the original scale.