DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
PyTorch implementation of the paper:
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog
Feilong Chen, Fandong Meng, Jiaming Xu, Peng Li, Bo Xu, and Jie Zhou
In AAAI 2020
Setup and Dependencies
This code is implemented in PyTorch v0.3.0 with CUDA 8 and cuDNN 7.
It is recommended to set up this source code using Anaconda or Miniconda.
- Install the Anaconda or Miniconda distribution (Python 3.6+) from the official downloads page.
- Clone this repository and create an environment:
```sh
git clone https://github.com/phellonchen/DMRM.git
conda create -n dmrm_visdial python=3.6

# activate the environment and install all dependencies
conda activate dmrm_visdial
cd $PROJECT_ROOT/
pip install -r requirements.txt
```
Download Features
- Download the VisDial dialog JSON files from here and keep them under the $PROJECT_ROOT/data directory so that the default arguments work.
- We used a Faster R-CNN pre-trained on Visual Genome to extract image features. Download the image features below and put each file under the $PROJECT_ROOT/data directory:
  - features_faster_rcnn_x101_train.h5: bottom-up features of 36 proposals per image from the train split.
  - features_faster_rcnn_x101_val.h5: bottom-up features of 36 proposals per image from the val split.
  - features_faster_rcnn_x101_test.h5: bottom-up features of 36 proposals per image from the test split.
- Download the GloVe pretrained word vectors from here, and keep glove.6B.300d.txt under the $PROJECT_ROOT/data directory.
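Before running preprocessing, it can help to verify that all downloads landed in the right place. The following is an illustrative sketch (not part of the repository) that checks for the expected files under the data directory:

```python
# Sanity-check sketch: report which of the expected download files are
# missing from the data directory. The file list matches the downloads
# described above; the helper itself is hypothetical.
from pathlib import Path

EXPECTED = [
    "features_faster_rcnn_x101_train.h5",
    "features_faster_rcnn_x101_val.h5",
    "features_faster_rcnn_x101_test.h5",
    "glove.6B.300d.txt",
]

def missing_files(data_dir):
    # Return the expected file names that do not exist under data_dir.
    data = Path(data_dir)
    return [name for name in EXPECTED if not (data / name).exists()]

print(missing_files("data"))  # [] once everything is downloaded
```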
Data preprocessing & Word embedding initialization
```sh
# data preprocessing
cd $PROJECT_ROOT/script/
python prepro.py

# word embedding vector initialization (GloVe)
cd $PROJECT_ROOT/script/
python create_glove.py
```
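The embedding-initialization step can be sketched as follows. This is a minimal illustration in the spirit of script/create_glove.py, not the repository's actual code; the function name and demo file are assumptions. It builds an embedding matrix for a vocabulary, copying pretrained GloVe vectors where available and randomly initializing the rest:

```python
# Hedged sketch of GloVe-based embedding initialization (illustrative only).
import random

def build_embedding_matrix(glove_path, vocab, dim=300):
    # Parse "word v1 v2 ... vdim" lines into a word -> vector table.
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    # Random init for out-of-vocabulary words, pretrained vectors otherwise.
    matrix = [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]
    for i, word in enumerate(vocab):
        if word in vectors:
            matrix[i] = vectors[word]
    return matrix

# Tiny stand-in file; the real input is glove.6B.300d.txt with dim=300.
with open("demo_glove.txt", "w", encoding="utf-8") as f:
    f.write("hello 0.1 0.1 0.1 0.1\n")

matrix = build_embedding_matrix("demo_glove.txt", ["hello", "unk"], dim=4)
print(len(matrix), len(matrix[0]))  # 2 4
```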
Training
Simple run
```sh
python main_v0.9.py   # VisDial v0.9
python main_v1.0.py   # VisDial v1.0
```
Saving model checkpoints
The model saves a checkpoint at every epoch and updates the best one. You can change this behavior by editing train.py.
Logging
The training log at $PROJECT_ROOT/save_models/<time>/log.txt records the epoch, loss, and learning rate.
Evaluation
A trained model checkpoint can be evaluated as follows:
```sh
python eval_v0.9.py   # VisDial v0.9
python eval_v1.0.py   # VisDial v1.0
```
Results
Performance on v0.9 val-std (trained on v0.9 train):
| Model | MRR | R@1 | R@5 | R@10 | Mean |
|---|---|---|---|---|---|
| DMRM | 55.96 | 46.20 | 66.02 | 72.43 | 13.15 |
Performance on v1.0 val-std (trained on v1.0 train):
| Model | MRR | R@1 | R@5 | R@10 | Mean |
|---|---|---|---|---|---|
| DMRM | 50.16 | 40.15 | 60.02 | 67.21 | 15.19 |
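The metrics in the tables above can be computed from the rank of the ground-truth answer. A short sketch, assuming the standard VisDial evaluation protocol (each question has 100 candidate answers; the function here is illustrative, not the repository's evaluation code):

```python
# Sketch of the reported retrieval metrics: gt_ranks holds the 1-based rank
# of the ground-truth answer among the candidates for every question.
def visdial_metrics(gt_ranks):
    n = len(gt_ranks)
    recall = lambda k: 100.0 * sum(r <= k for r in gt_ranks) / n
    return {
        "MRR": 100.0 * sum(1.0 / r for r in gt_ranks) / n,  # mean reciprocal rank
        "R@1": recall(1),
        "R@5": recall(5),
        "R@10": recall(10),
        "Mean": sum(gt_ranks) / n,  # mean rank (lower is better)
    }

metrics = visdial_metrics([1, 3, 20])
print(metrics["Mean"])  # 8.0
```

Higher MRR and R@k are better, while a lower mean rank is better, which is why the Mean column trends opposite to the others.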
If you find this repository useful, please consider citing our work:
```bibtex
@inproceedings{chen2020dmrm,
  title={DMRM: A dual-channel multi-hop reasoning model for visual dialog},
  author={Chen, Feilong and Meng, Fandong and Xu, Jiaming and Li, Peng and Xu, Bo and Zhou, Jie},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={7504--7511},
  year={2020}
}
```
License
MIT License