Junbin Xiao
Hi, please refer to the link given in the README and our paper. For multiple-choice QA, the answer is appended after the corresponding question.
Hi, we use I3D with a ResNeXt backbone to capture motion information. The code can also be found in HCRN. The number of sampled clips depends on your dataset, usually...
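As a rough illustration of clip sampling for motion features, here is a minimal sketch of uniformly spacing a fixed number of fixed-length clips over a video. This is a hypothetical example, not the actual HCRN/NExT-QA code; the function name and parameters are illustrative only.

```python
import numpy as np

def sample_clip_indices(num_frames, num_clips=8, frames_per_clip=16):
    """Uniformly place `num_clips` clip centers across the video and
    return the frame indices of each clip, clamped to the valid range.

    Hypothetical sketch: the real pipeline may sample differently."""
    half = frames_per_clip // 2
    centers = np.linspace(half, num_frames - half, num_clips).astype(int)
    clips = []
    for c in centers:
        start = c - half
        idx = np.clip(np.arange(start, start + frames_per_clip),
                      0, num_frames - 1)
        clips.append(idx)
    return np.stack(clips)  # shape: (num_clips, frames_per_clip)
```

Each row of the result can then be fed to the 3D CNN as one clip; short videos simply repeat boundary frames because of the clamping.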
You need to fine-tune BERT on your own dataset and then extract token representations for the sentences. Afterwards, you can use the extracted BERT features to replace the GloVe embedding layer...
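To make the "replace the GloVe embedding layer" step concrete, here is a minimal sketch of the two input paths: indexing a static GloVe matrix versus consuming precomputed per-token BERT features directly. The 300-d GloVe and 768-d BERT sizes are the common defaults and are assumptions here, as are the function names.

```python
import numpy as np

def embed_with_glove(token_ids, glove_matrix):
    # Original path: look up rows of a static (vocab_size, 300) matrix.
    return glove_matrix[token_ids]

def embed_with_bert(bert_features):
    # Replacement path: the offline-extracted (num_tokens, 768) BERT
    # features are already contextual embeddings, so they bypass the
    # lookup entirely (a linear projection to the model's hidden size
    # would typically follow).
    return np.asarray(bert_features, dtype=np.float32)
```

The rest of the model stays unchanged except for the input dimension of the first layer after the embeddings.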
Hi, we have released the edited code for fine-tuning BERT on NExT-QA [here](https://drive.google.com/file/d/1Z0RMnIJrqQcFQEhhuhiRvDvyRmaKdPbg/view?usp=sharing). You can also fine-tune BERT on other datasets with this code.
Yes. Please download it via this [link](https://drive.google.com/file/d/1_wwJrB7r974Eq3VkXUnUlPhBwQRDLMbZ/view?usp=share_link).
It splits the dataset for better storage, I/O, and online sharing.
Hi, please do not change the evaluation file. Every video-question pair should get a prediction; if some do not, the problem lies in the prediction part, not in the evaluation. I...
The code was tested on TITAN XP and V100 GPUs with PyTorch 1.6.0. The CUDA version can be 10.2 or 11.0/1/5. It does not make sense to change the name; the feature...
Hi, please refer to ```build_vocab.py``` and ```word2vec.py```.
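For readers without the repo at hand, here is a minimal sketch of what a `build_vocab.py`-style script typically does: count tokens, assign indices with reserved `<pad>`/`<unk>` slots, and build a GloVe-aligned embedding matrix. This mirrors the common pattern only; the actual scripts in the repo may differ in details.

```python
from collections import Counter
import numpy as np

def build_vocab(sentences, min_count=1):
    """Count tokens and assign indices; 0 and 1 are reserved
    for <pad> and <unk>. A generic sketch, not the repo's exact code."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    vocab = {'<pad>': 0, '<unk>': 1}
    for tok, c in counts.most_common():
        if c >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def build_glove_matrix(vocab, glove, dim=300):
    """Align an embedding matrix to the vocab; `glove` is assumed to be
    a word -> vector dict. Missing words get small random vectors."""
    mat = np.random.uniform(-0.1, 0.1, (len(vocab), dim)).astype(np.float32)
    mat[vocab['<pad>']] = 0.0  # padding row stays zero
    for tok, idx in vocab.items():
        if tok in glove:
            mat[idx] = glove[tok]
    return mat
```

The resulting matrix is what an `nn.Embedding` layer would be initialized from.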