A VideoQA dataset based on the videos from ActivityNet

ActivityNet-QA

The ActivityNet-QA dataset contains 58,000 human-annotated QA pairs on 5,800 videos derived from the popular ActivityNet dataset. The dataset provides a benchmark for testing the performance of VideoQA models on long-term spatio-temporal reasoning.

Dataset

The dataset folder contains the json files for the questions and answers. We do not maintain the raw video files; they can be obtained from the official website: ActivityNet 200 (v1.3).

Evaluation

We provide a simple script and an example prediction json file under the evaluation folder to calculate the accuracy per question type.

python evaluation/eval.py --pred_file evaluation/pred_val_example.json --gt_file dataset/val_a.json
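
For readers who want to see what "accuracy per type" means in practice, the following is a minimal sketch and not the actual contents of evaluation/eval.py. It assumes that each ground-truth entry carries `question_id`, `answer`, and `type` fields and that each prediction carries `question_id` and `answer`; these field names, the case-insensitive exact-match rule, and the function name `accuracy_per_type` are assumptions made for illustration and should be checked against the json files in the dataset folder.

```python
from collections import defaultdict


def accuracy_per_type(predictions, ground_truth):
    """Return {question_type: accuracy} using exact string match.

    Both arguments are lists of dicts. The field names used here
    ("question_id", "answer", "type") are assumed, not confirmed.
    """
    pred_by_id = {p["question_id"]: p["answer"] for p in predictions}
    correct = defaultdict(int)
    total = defaultdict(int)

    for gt in ground_truth:
        qtype = gt["type"]
        total[qtype] += 1
        pred = pred_by_id.get(gt["question_id"])
        # A missing prediction counts as wrong; matching ignores case
        # and surrounding whitespace (an assumed normalization).
        if pred is not None and pred.strip().lower() == gt["answer"].strip().lower():
            correct[qtype] += 1

    return {t: correct[t] / total[t] for t in total}


# Hypothetical toy data, not taken from the dataset.
toy_gt = [
    {"question_id": "v1_0", "answer": "yes", "type": 3},
    {"question_id": "v1_1", "answer": "two", "type": 7},
    {"question_id": "v2_0", "answer": "no", "type": 3},
]
toy_pred = [
    {"question_id": "v1_0", "answer": "Yes"},
    {"question_id": "v1_1", "answer": "three"},
]
toy_result = accuracy_per_type(toy_pred, toy_gt)
```

Counting unanswered questions as incorrect keeps the denominator tied to the ground-truth file, so a partial prediction file cannot inflate the score.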

Licence

The code and the dataset are distributed under the MIT License. They may be used for non-commercial purposes only.

Citation

If the project is helpful for your research, please cite:

@inproceedings{yu2019activityqa,
    author = {Yu, Zhou and Xu, Dejing and Yu, Jun and Yu, Ting and Zhao, Zhou and Zhuang, Yueting and Tao, Dacheng},
    title = {ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering},
    booktitle = {AAAI},
    pages = {9127--9134},
    year = {2019}
}