poselstm-pytorch
PoseLSTM and PoseNet implementation in PyTorch
This is a PyTorch implementation of PoseLSTM and PoseNet, built on the Pix2Pix code.
Prerequisites
- Linux
- Python 3.5.2
- CPU or NVIDIA GPU + CUDA CuDNN
Getting Started
Installation
- Install PyTorch and dependencies from http://pytorch.org
- Clone this repo:
git clone https://github.com/hazirbas/poselstm-pytorch
cd poselstm-pytorch
pip install -r requirements.txt
PoseNet train/test
- Download a Cambridge Landmarks dataset (e.g. KingsCollege) into the datasets/ folder.
- Compute the mean image:
python util/compute_image_mean.py --dataroot datasets/KingsCollege --height 256 --width 455 --save_resized_imgs
- Train a model:
python train.py --model posenet --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --beta 500 --gpu 0
- To view training errors and loss plots, set --display_id 1, run python -m visdom.server and click the URL http://localhost:8097. Checkpoints are saved under ./checkpoints/posenet/KingsCollege/beta500/.
- Test the model:
python test.py --model posenet --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --gpu 0
The test errors will be saved to a text file under ./results/posenet/KingsCollege/beta500/.
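The mean-image step above can be pictured with the following minimal sketch. It is not the code in util/compute_image_mean.py; the function names `compute_mean_image` and `subtract_mean` are hypothetical, NumPy is assumed to be installed, and the images are assumed to be already resized to 256 x 455 (the --height and --width values used above).

```python
import numpy as np


def compute_mean_image(images):
    """Average equally sized H x W x C images (hypothetical helper).

    Accumulates in float64 so that uint8 inputs cannot overflow.
    """
    total = None
    count = 0
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        if total is None:
            total = np.zeros_like(arr)
        if arr.shape != total.shape:
            raise ValueError("all images must share one shape; resize them first")
        total += arr
        count += 1
    if count == 0:
        raise ValueError("no images given")
    return total / count


def subtract_mean(img, mean_image):
    """Center one image by subtracting the dataset mean image."""
    return np.asarray(img, dtype=np.float64) - mean_image


# Synthetic stand-ins for resized 256 x 455 RGB training frames.
frames = [np.full((256, 455, 3), v, dtype=np.uint8) for v in (10, 20, 30)]
mean_image = compute_mean_image(frames)
centered = subtract_mean(frames[0], mean_image)
```

In a real run the frames would come from the dataset folder rather than from `np.full`; how the repository loads and stores them is not shown here.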
PoseLSTM train/test
- Train a model:
python train.py --model poselstm --dataroot ./datasets/KingsCollege --name poselstm/KingsCollege/beta500 --beta 500 --niter 1200 --gpu 0
- Test the model:
python test.py --model poselstm --dataroot ./datasets/KingsCollege --name poselstm/KingsCollege/beta500 --gpu 0
Initialize the network with GoogLeNet pretrained on the Places dataset
If you would like to initialize the network with the pretrained weights, download the places-googlenet.pickle file into the pretrained_models/ folder:
wget https://vision.in.tum.de/webarchive/hazirbas/poselstm-pytorch/places-googlenet.pickle
Optimization scheme and loss weights
- We use the training scheme defined in PoseLSTM
- Note that mean subtraction is not used in PoseLSTM models
- Results can be improved with a hyper-parameter search
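For context on what the --beta flag weights, here is a minimal sketch of the pose loss in the form given in the PoseNet paper: the position error plus beta times the error against the unit-normalized ground-truth quaternion. The criterion actually used in this repository may differ (for example in the norm it applies), and the names `l2_norm` and `pose_loss` are hypothetical.

```python
import math


def l2_norm(v):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def pose_loss(pred_xyz, pred_q, true_xyz, true_q, beta):
    """Position term plus beta-weighted orientation term (PoseNet paper form)."""
    n = l2_norm(true_q)
    if n == 0.0:
        raise ValueError("ground-truth quaternion has zero length")
    true_unit = [c / n for c in true_q]
    pos_term = l2_norm([p - t for p, t in zip(pred_xyz, true_xyz)])
    ori_term = l2_norm([p - t for p, t in zip(pred_q, true_unit)])
    return pos_term + beta * ori_term


# Same prediction scored with two different weights.
true_xyz, true_q = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]
pred_xyz, pred_q = [3.0, 4.0, 0.0], [0.8, 0.6, 0.0, 0.0]

loss_low = pose_loss(pred_xyz, pred_q, true_xyz, true_q, beta=100)
loss_high = pose_loss(pred_xyz, pred_q, true_xyz, true_q, beta=500)
```

A larger beta makes the orientation term dominate, which is why a separate value is chosen for each scene.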
| Dataset | beta | PoseNet (CAFFE) | PoseNet | PoseLSTM (TF) | PoseLSTM |
|---|---|---|---|---|---|
| King's College | 500 | 1.92m 5.40° | 1.19m 4.51° | 0.99m 3.65° | 0.90m 3.96° |
| Old Hospital | 1500 | 2.31m 5.38° | 1.91m 4.05° | 1.51m 4.29° | 1.79m 4.28° |
| Shop Façade | 100 | 1.46m 8.08° | 1.30m 8.13° | 1.18m 7.44° | 0.98m 6.20° |
| St Mary's Church | 250 | 2.65m 8.48° | 1.89m 7.27° | 1.52m 6.68° | 1.68m 6.41° |
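The table reports a position error in meters and an orientation error in degrees. A common way to compute such a pair for one test image is sketched below, assuming poses are given as an xyz position and a quaternion; this is an illustration, not the evaluation code of test.py, and `pose_errors` is a hypothetical name.

```python
import math


def _unit(q):
    """Return q scaled to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero-length quaternion")
    return [c / n for c in q]


def pose_errors(pred_xyz, pred_q, true_xyz, true_q):
    """Return (position error in meters, orientation error in degrees)."""
    pos_err = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_xyz, true_xyz)))
    # |<q1, q2>| handles the q / -q ambiguity; clamp guards acos against rounding.
    dot = abs(sum(a * b for a, b in zip(_unit(pred_q), _unit(true_q))))
    ang_err = math.degrees(2.0 * math.acos(min(1.0, dot)))
    return pos_err, ang_err


# Prediction that is 5 m off and rotated 10 degrees about the x axis.
half_angle = math.radians(5.0)
pos_err, ang_err = pose_errors(
    [3.0, 4.0, 0.0], [math.cos(half_angle), math.sin(half_angle), 0.0, 0.0],
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
)
```

How the per-image errors are aggregated into the single figure shown per scene is not stated in this README.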
Citation
@inproceedings{PoseNet15,
title={PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization},
author={Alex Kendall and Matthew Grimes and Roberto Cipolla},
booktitle={ICCV},
year={2015}
}
@inproceedings{PoseLSTM17,
author = {Florian Walch and Caner Hazirbas and Laura Leal-Taixé and Torsten Sattler and Sebastian Hilsenbeck and Daniel Cremers},
title = {Image-based localization using LSTMs for structured feature correlation},
month = {October},
year = {2017},
booktitle = {ICCV},
eprint = {1611.07890},
url = {https://github.com/NavVisResearch/NavVis-Indoor-Dataset},
}
Acknowledgments
Code is inspired by pytorch-CycleGAN-and-pix2pix.