hscnet
Hierarchical Scene Coordinate Classification and Regression for Visual Localization
This is the PyTorch implementation of our paper, a hierarchical scene coordinate prediction approach for one-shot RGB camera relocalization:
Hierarchical Scene Coordinate Classification and Regression for Visual Localization, CVPR 2020
Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala
Setup
Python3 and the following packages are required:
cython
numpy
pytorch
opencv
tqdm
imgaug
It is recommended to use a conda environment:
- Install anaconda or miniconda.
- Create the environment: conda env create -f environment.yml
- Activate the environment: conda activate hscnet
To run the evaluation script, you will need to build the cython module:
cd ./pnpransac
python setup.py build_ext --inplace
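Once the environment is set up, a quick import check can catch missing packages before a long run. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and the import names assume the usual mapping for these packages (pytorch is imported as torch, opencv as cv2, cython as Cython).

```python
import importlib.util

# Import names for the packages listed in Setup (assumed mapping:
# pytorch -> torch, opencv -> cv2, cython -> Cython).
REQUIRED_MODULES = ["Cython", "numpy", "torch", "cv2", "tqdm", "imgaug"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=REQUIRED_MODULES):
    """Summarize the check as a single human-readable line."""
    missing = missing_modules(names)
    if missing:
        return "Missing modules: " + ", ".join(missing)
    return "All required modules were found."
```

Calling print(report()) inside the activated environment lists anything that still needs to be installed. The check only locates modules and does not import them, so it stays fast even for large packages.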
Data
We currently support 7-Scenes, 12-Scenes, Cambridge Landmarks, and the three combined scenes which have been used in the paper. We will upload the code for the Aachen Day-Night dataset experiments.
You will need to download the datasets from their websites. We also provide a data package which contains the other files needed to reproduce our results. Note that for the Cambridge Landmarks dataset, you will also need to rename the files according to the train/test.txt files and put them in the train/test folders. The depth maps we used for this dataset are from DSAC++. The provided label maps are obtained by running k-means hierarchically on the 3D points.
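To illustrate what running k-means hierarchically on 3D points means, here is a minimal two-level sketch in NumPy. It is not the code used to produce the provided label maps: the function names, the number of levels, the cluster counts, and the plain Lloyd's iterations are all assumptions made for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means. Returns (centers, labels)."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    # Final assignment, consistent with the final centers.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def hierarchical_kmeans(points, k_coarse, k_fine, seed=0):
    """Cluster all points, then cluster each coarse region separately.

    Returns (coarse_labels, fine_labels); a fine label is only meaningful
    together with the coarse label of the same point.
    """
    points = np.asarray(points, dtype=np.float64)
    _, coarse = kmeans(points, k_coarse, seed=seed)
    fine = np.zeros(len(points), dtype=np.int64)
    for c in range(k_coarse):
        idx = np.flatnonzero(coarse == c)
        # A region cannot be split into more subregions than it has points.
        k = min(k_fine, len(idx))
        if k > 0:
            fine[idx] = kmeans(points[idx], k, seed=seed)[1]
    return coarse, fine
```

For example, hierarchical_kmeans(points, k_coarse=4, k_fine=3) on an (N, 3) array gives each point a coarse region label in 0..3 and a subregion label in 0..2 within that region.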
Evaluation
The trained models for the main experiments in the paper can be downloaded here.
To evaluate on a scene from a dataset:
python eval.py \
--model [hscnet|scrnet] \
--dataset [7S|12S|Cambridge|i7S|i12S|i19S] \
--scene scene_name \
--checkpoint /path/to/saved/model/ \
--data_path /path/to/data/
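When evaluating several scenes, it can be convenient to assemble the command above programmatically. The sketch below uses only the flags shown in the command template; the helper names and the per-scene checkpoint mapping are hypothetical, and it assumes eval.py is in the current working directory.

```python
import subprocess
import sys


def build_eval_command(model, dataset, scene, checkpoint, data_path):
    """Build the eval.py argument list using the flags documented above."""
    return [
        sys.executable, "eval.py",
        "--model", model,
        "--dataset", dataset,
        "--scene", scene,
        "--checkpoint", checkpoint,
        "--data_path", data_path,
    ]


def evaluate_scenes(model, dataset, checkpoints, data_path):
    """Run eval.py once per scene.

    `checkpoints` is a hypothetical dict mapping scene name -> checkpoint path.
    """
    for scene, checkpoint in checkpoints.items():
        cmd = build_eval_command(model, dataset, scene, checkpoint, data_path)
        # check=True stops the loop as soon as one evaluation fails.
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.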
Training
You can train the hierarchical scene coordinate network or the baseline regression network by running the following command (--scene is not required for the combined scenes):
python train.py \
--model [hscnet|scrnet] \
--dataset [7S|12S|Cambridge|i7S|i12S|i19S] \
--scene scene_name \
--n_iter number_of_training_iterations \
--data_path /path/to/data/
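The optional --scene argument can be handled in a small wrapper. This is a sketch under assumptions, not part of the repository: the helper name is hypothetical, and it assumes that the three combined scenes correspond to the dataset values i7S, i12S, and i19S.

```python
import sys

# Assumption: these dataset values are the combined scenes, for which
# --scene is not required.
COMBINED_DATASETS = ("i7S", "i12S", "i19S")


def build_train_command(model, dataset, n_iter, data_path, scene=None):
    """Build the train.py argument list, omitting --scene when it is not given."""
    cmd = [sys.executable, "train.py", "--model", model, "--dataset", dataset]
    if scene is not None:
        cmd += ["--scene", scene]
    elif dataset not in COMBINED_DATASETS:
        # Single-scene datasets need to know which scene to train on.
        raise ValueError("--scene is required for dataset " + repr(dataset))
    cmd += ["--n_iter", str(n_iter), "--data_path", data_path]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root.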
License
Copyright (c) 2020 AaltoVision.
This code is released under the MIT License.
Acknowledgements
The PnP-RANSAC pose solver builds on DSAC++. The sensor calibration file and the normalization translation files for the 7-Scenes dataset are from DSAC. The rendered depth images for the Cambridge Landmarks dataset are from DSAC++.
Citation
Please consider citing our paper if you find this code useful for your research:
@inproceedings{li2020hscnet,
title = {Hierarchical Scene Coordinate Classification and Regression for Visual Localization},
author = {Li, Xiaotian and Wang, Shuzhe and Zhao, Yi and Verbeek, Jakob and Kannala, Juho},
booktitle = {CVPR},
year = {2020}
}