L-GCN
PyTorch implementation of L-GCN [https://arxiv.org/abs/2008.09105]
Location-aware Graph Convolutional Networks for Video Question Answering
This repo holds the code for the L-GCN framework presented at AAAI 2020:
Location-aware Graph Convolutional Networks for Video Question Answering. Deng Huang, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, Chuang Gan. AAAI 2020, New York.
[Paper]
Contents
- Usage Guide
  - Code Preparation
  - Module Preparation
  - Data Preparation
  - Training
- Other Info
  - Citation
  - Contact
Usage Guide
Code Preparation
Clone this repo with git
git clone https://github.com/SunDoge/L-GCN.git
cd L-GCN
Module Preparation
This repo requires PyTorch >= 1.2.
Install the other dependencies by running:
pip install -r requirements.txt
python -m spacy download en
Data Preparation
Data Processing
Save frames
Extract frames by following the instructions in tgif-qa.
./save-frames.sh data/tgif/{gifs,frames}
Some GIFs cannot be read by ffmpeg; for those, use ImageMagick to save the frames instead:
convert img.gif img/%d.jpg
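The ImageMagick fallback above can be scripted for GIFs that ffmpeg fails on. The following is a minimal sketch, not part of this repo: the helper names are hypothetical, and it assumes each GIF's frames go into a directory named after the GIF under the frames root, mirroring the `img.gif` to `img/%d.jpg` pattern shown above.

```python
import subprocess
from pathlib import Path
from typing import List


def imagemagick_command(gif_path: Path, frames_root: Path) -> List[str]:
    """Build the `convert` call that writes one JPEG per GIF frame.

    Assumption: frames for `name.gif` are stored in `<frames_root>/name/`.
    """
    out_dir = frames_root / gif_path.stem
    return ["convert", str(gif_path), str(out_dir / "%d.jpg")]


def save_frames_with_imagemagick(gif_path: Path, frames_root: Path) -> None:
    """Create the output directory, then run ImageMagick on a single GIF."""
    (frames_root / gif_path.stem).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if `convert` exits with an error.
    subprocess.run(imagemagick_command(gif_path, frames_root), check=True)
```

Calling `save_frames_with_imagemagick(Path("data/tgif/gifs/img.gif"), Path("data/tgif/frames"))` requires ImageMagick's `convert` to be on the `PATH`.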
Split frames
Since there are too many frames to process, we split them into N parts.
python -m scripts.split_n_parts -o data/tgif/frame_splits/
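To illustrate the idea of splitting the frame list into N parts, here is a minimal sketch of one way to do it. It is not the code in `scripts.split_n_parts`, whose exact behavior and output format should be checked in the script itself; the function name `split_into_parts` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_parts(items: Sequence[T], n: int) -> List[List[T]]:
    """Split `items` into `n` contiguous parts whose sizes differ by at most one."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(len(items), n)
    parts: List[List[T]] = []
    start = 0
    for i in range(n):
        # The first `extra` parts each take one additional item.
        size = base + (1 if i < extra else 0)
        parts.append(list(items[start:start + size]))
        start += size
    return parts
```

Keeping the parts contiguous and near-equal in size means each split takes roughly the same time to process, so the splits can be handled by separate jobs.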
Get bboxes
Extract bboxes using Mask R-CNN. Check the script for more args.
python -m scripts.extract_bboxes_with_maskrcnn \
-f data/tgif/frame_splits/split0.pkl \
-o data/tgif/bboxes_splits/split0.pt \
-c /path/to/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml
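The command above handles a single split, so it has to be repeated for each one. The sketch below builds one command per split; it assumes the remaining splits follow the same `split<i>.pkl` / `split<i>.pt` naming as `split0`, which should be verified against the output of the split step. The helper name is hypothetical.

```python
import subprocess
from typing import List


def bbox_extraction_commands(num_splits: int, config_path: str) -> List[List[str]]:
    """Build one Mask R-CNN bbox extraction command per frame split.

    Assumption: splits are named split0, split1, ..., split<num_splits - 1>.
    """
    commands = []
    for i in range(num_splits):
        commands.append([
            "python", "-m", "scripts.extract_bboxes_with_maskrcnn",
            "-f", f"data/tgif/frame_splits/split{i}.pkl",
            "-o", f"data/tgif/bboxes_splits/split{i}.pt",
            "-c", config_path,
        ])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run the commands one after another, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

The same pattern applies to the per-split feature extraction step further down; only the script name and flags change.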
Merge bboxes
python -m scripts.merge_box_scores_and_labels \
--bboxes data/tgif/bboxes_splits \
-o data/tgif/bboxes
Extract bbox features
python -m scripts.extract_resnet152_features_with_bboxes \
-i data/tgif/frames \
-f data/tgif/frame_splits/split0.pkl \
-p data/tgif/bboxes_splits/split0.pt \
-o data/tgif/bbox_features_splits/split0layer4
Merge bbox features
python -m scripts.merge_bboxes \
--bboxes data/tgif/bbox_features_splits \
-o data/tgif/resnet152_bbox_features
Extract pool5 features
python -m scripts.extract_resnet152_features \
-i data/tgif/frames
Training
Use the following command to train L-GCN
python train.py -c config/resnet152-bbox/$TASK_CONFIG -e $PATH_TO_SAVE_RESULT
- $TASK_CONFIG is the task config file; there are four choices: action.conf, transition.conf, frameqa.conf, and count.conf.
- $PATH_TO_SAVE_RESULT is the path where the results are saved.
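To train all four tasks in sequence, the training command can be generated per task. This is a minimal sketch: the helper name and the `results/<task>` output layout are assumptions made for illustration, not conventions of this repo.

```python
from typing import List

# The four task configs listed above.
TASK_CONFIGS = ["action.conf", "transition.conf", "frameqa.conf", "count.conf"]


def train_command(task_config: str, result_dir: str) -> List[str]:
    """Build the train.py invocation for one task config."""
    if task_config not in TASK_CONFIGS:
        raise ValueError(f"unknown task config: {task_config}")
    return [
        "python", "train.py",
        "-c", f"config/resnet152-bbox/{task_config}",
        "-e", result_dir,
    ]


if __name__ == "__main__":
    for config in TASK_CONFIGS:
        task_name = config.rsplit(".", 1)[0]
        # Hypothetical layout: one result directory per task.
        print(" ".join(train_command(config, f"results/{task_name}")))
```

Each printed line can be run as-is from the repo root, or passed to `subprocess.run` as a list to avoid shell quoting issues.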
Other Info
Citation
Please cite the following paper if you find L-GCN useful for your research:
@inproceedings{L-GCN2020AAAI,
author = {Deng Huang and
Peihao Chen and
Runhao Zeng and
Qing Du and
Mingkui Tan and
Chuang Gan},
title = {Location-aware Graph Convolutional Networks for Video Question Answering},
booktitle = {AAAI},
year = {2020},
}
Contact
For any questions, please file an issue or contact:
[email protected]
[email protected]