Sequence-Level-Semantics-Aggregation
Sequence Level Semantics Aggregation for Video Object Detection
Introduction
This is the official MXNet implementation of Sequence Level Semantics Aggregation for Video Object Detection (ICCV 2019, oral). SELSA aggregates information at the level of the full video sequence while keeping a simple and clean pipeline. It achieves 82.69 mAP with ResNet-101 on the ImageNet VID validation set.
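To make "sequence-level aggregation" concrete, here is a minimal sketch. It is not the code in this repo. It assumes that each proposal is represented by a feature vector, that semantic similarity is measured with cosine similarity, and that each proposal is replaced by a softmax-weighted average of all proposals sampled from the video. The function names and toy features are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(features):
    """Replace each feature with a similarity-weighted average of all features."""
    aggregated = []
    for f in features:
        # Weights depend only on feature similarity, not on frame distance.
        weights = softmax([cosine(f, g) for g in features])
        aggregated.append([
            sum(w * g[d] for w, g in zip(weights, features))
            for d in range(len(f))
        ])
    return aggregated


# Toy proposals (assumed): the first two look alike, the third differs.
proposals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
aggregated = aggregate(proposals)
```

In this toy example the first aggregated feature stays close to the first two proposals, because they receive the largest weights.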
Citation
If you use the code or models in your research, please cite with:
@article{wu2019selsa,
title={Sequence Level Semantics Aggregation for Video Object Detection},
author={Wu, Haiping and Chen, Yuntao and Wang, Naiyan and Zhang, Zhaoxiang},
journal={ICCV 2019},
year={2019}
}
Main Results
method | training data | testing data | mAP(%) | mAP(%)(slow) | mAP(%)(medium) | mAP(%)(fast) |
---|---|---|---|---|---|---|
Single-frame baseline (Faster R-CNN, ResNet-101) | ImageNet DET train + VID train | ImageNet VID validation | 73.6 | 82.1 | 71.0 | 52.5 |
SELSA (Faster R-CNN, ResNet-101) | ImageNet DET train + VID train | ImageNet VID validation | 80.3 | 86.9 | 78.9 | 61.4 |
SELSA (Faster R-CNN, ResNet-101, Data Aug) | ImageNet DET train + VID train | ImageNet VID validation | 82.7 | 88.0 | 81.4 | 67.1 |
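The gains over the single-frame baseline can be read off the table. The short sketch below recomputes them from the reported numbers. The dictionary keys and the `gains` helper are illustrative names and are not part of this repo.

```python
# mAP(%) values copied from the table above.
results = {
    "baseline": {"mAP": 73.6, "slow": 82.1, "medium": 71.0, "fast": 52.5},
    "selsa": {"mAP": 80.3, "slow": 86.9, "medium": 78.9, "fast": 61.4},
    "selsa_aug": {"mAP": 82.7, "slow": 88.0, "medium": 81.4, "fast": 67.1},
}


def gains(method):
    """Absolute mAP improvement of `method` over the single-frame baseline."""
    base = results["baseline"]
    return {k: round(results[method][k] - base[k], 1) for k in base}


selsa_gain = gains("selsa")
selsa_aug_gain = gains("selsa_aug")
```

For both SELSA rows the largest improvement is on fast-moving objects.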
Installation
Please note that this repo is based on Python 2.
- Clone the repository.
git clone https://github.com/happywu/Sequence-Level-Semantics-Aggregation
- Install MXNet following https://mxnet.incubator.apache.org/get_started. We tested our code on MXNet v1.3.0.
- Install packages via
  pip install -r requirements.txt
  sh init.sh
Preparation for Training & Testing
- Please download the ILSVRC2015 DET and ILSVRC2015 VID datasets, and make sure the directory layout looks like this:
  ./data/ILSVRC2015/
  ./data/ILSVRC2015/Annotations/DET
  ./data/ILSVRC2015/Annotations/VID
  ./data/ILSVRC2015/Data/DET
  ./data/ILSVRC2015/Data/VID
  ./data/ILSVRC2015/ImageSets
- Please download the ImageNet pre-trained ResNet-v1-101 model and our pretrained SELSA ResNet-101 model manually, and put them under the folder ./model. Make sure it looks like this:
  ./model/pretrained_model/resnet_v1_101-0000.params
  ./model/pretrained_model/selsa_rcnn_vid-0000.params
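Before launching a long run, it can help to confirm that the layout above is in place. The following is a small sketch of such a check. It is not part of this repo, the helper name `missing_paths` is hypothetical, and the only paths it knows about are the ones listed above.

```python
import os

# Directories and files expected by the layout described above.
EXPECTED_DIRS = [
    "data/ILSVRC2015/Annotations/DET",
    "data/ILSVRC2015/Annotations/VID",
    "data/ILSVRC2015/Data/DET",
    "data/ILSVRC2015/Data/VID",
    "data/ILSVRC2015/ImageSets",
]
EXPECTED_FILES = [
    "model/pretrained_model/resnet_v1_101-0000.params",
    "model/pretrained_model/selsa_rcnn_vid-0000.params",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    missing = []
    for rel in EXPECTED_DIRS:
        if not os.path.isdir(os.path.join(root, rel)):
            missing.append(rel)
    for rel in EXPECTED_FILES:
        if not os.path.isfile(os.path.join(root, rel)):
            missing.append(rel)
    return missing
```

Calling `missing_paths()` from the repository root returns an empty list when everything is where the scripts expect it.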
Testing
- To test the provided pretrained model, run the following command.
python experiments/selsa/test.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml --test-pretrained ./model/pretrained_model/selsa_rcnn_vid
You should obtain the results reported in the Main Results table above.
Training
- To train, use the following command.
  python experiments/selsa/train_end2end.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml
  A cache folder will be created automatically under output/selsa_rcnn/imagenet_vid/ to save the model and the log.
- To test your trained model, run the following command.
  python experiments/selsa/test.py --cfg experiments/selsa/cfgs/resnet_v1_101_rcnn_selsa_aug.yaml
Other implementations
PyTorch: MMTracking
Acknowledgement
This repo is modified from Flow-Guided-Feature-Aggregation.