Recurrent_Fusion_Network
Recurrent_Fusion_Network copied to clipboard
Source code for "Recurrent Fusion Network for Image Captioning".
Recurrent Fusion Network for Image Captioning
This repository includes the implementations for Recurrent Fusion Network for Image Captioning.
Requirements
- Python 3.6
- PyTorch 0.3.1
- Java
Training
0. Feature extraction
All scripts for feature extraction are included in data/feature_extraction. Please generate flipped and cropped images to perform data augmentation, download pre-trained models and extract features with the scripts. All extracted featuers should be put in the data directory.
1. Train with cross entropy loss
bash train_recurrent_fusion_model.sh
2. Training with reinforcement learning
bash train_recurrent_fusion_model_rl.sh
Evaluation
Evaluate with eval_single.sh and eval_ensemble.sh to obtain metric scores for single model and ensemble of multiple models, respectively.
Reference
If you find this repo useful, please consider citing:
@InProceedings{Jiang_2018_ECCV,
author = {Jiang, Wenhao and Ma, Lin and Jiang, Yu-Gang and Liu, Wei and Zhang, Tong},
title = {Recurrent Fusion Network for Image Captioning},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}
Acknowledgements
Our code is based on Ruotian Luo's implementation ans is reorganized by Zhiming Ma.