
Temporal augmentation with two-stream ConvNet features on human action recognition

Temporal Augmentation using frame-level features with RNN on UCF101

License: MIT

The two-stream ConvNet is recognized as one of the most successful deep ConvNet architectures for video understanding, specifically human action recognition. However, it suffers from insufficient temporal data for training.

This repository implements a temporal-segment RNN for training on videos with temporal augmentation. The implementation is based on example code from fb.resnet.torch and was largely modified to work with frame-level features.

Pre-saved features generated from ResNet-101 are provided.
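The temporal augmentation idea can be sketched as follows (in Python for illustration; the repository itself is Torch/Lua). The function name and the segment-based random sampling shown here are illustrative assumptions, not the repo's exact code: each epoch draws a different subset of the pre-saved frame-level features, so the RNN sees varied temporal samplings of the same video.

```python
import numpy as np

def temporal_augment(features, num_segments=25, rng=None):
    """Sketch of segment-based temporal augmentation (hypothetical helper,
    not the repository's actual code): split the T frame-level feature
    vectors into num_segments roughly equal chunks and draw one random
    frame from each chunk, preserving temporal order."""
    rng = rng or np.random.default_rng()
    T = features.shape[0]
    # Chunk boundaries: num_segments + 1 cut points over [0, T]
    bounds = np.linspace(0, T, num_segments + 1).astype(int)
    # One random frame index per chunk (guard against empty chunks)
    idx = [int(rng.integers(lo, max(lo + 1, hi)))
           for lo, hi in zip(bounds[:-1], bounds[1:])]
    return features[np.asarray(idx)]

# Example: 120 frames of 2048-d ResNet-101 features -> 25 sampled frames
feats = np.random.randn(120, 2048).astype(np.float32)
clip = temporal_augment(feats, num_segments=25)
print(clip.shape)  # (25, 2048)
```

Because a fresh random sample is drawn every call, repeated epochs over the same video effectively multiply the temporal training data, which is the motivation stated above.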

Prerequisites

  • Linux (tested on Ubuntu 14.04)
  • Torch
  • CUDA and cuDNN
  • NVIDIA GPU is strongly recommended

Video Dataset

UCF101

The starter code provided here should be relatively easy to adapt to other datasets.

Features for training

I re-trained the two-stream ConvNet using a pre-trained ResNet-101 on the UCF101 dataset. Please download the frame-level features from the links below.

The features are coming soon.

UCF-101 split 1

You can also generate features for splits 2 and 3 by rearranging the features according to the split lists provided by UCF101.

Usage

Specify the downloaded features and the type of RNN model you would like to use in opt.lua, then run:

th main.lua

Citation

Please cite our paper if you find the code useful.

TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition

@article{ma2017tslstm,
  title={TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition},
  author={Ma, Chih-Yao and Chen, Min-Hung and Kira, Zsolt and AlRegib, Ghassan},
  journal={arXiv preprint arXiv:1703.10667},
  year={2017}
}