
[CVPR2022] Decoupling Makes Weakly Supervised Local Feature Better

Decoupling Makes Weakly Supervised Local Feature Better (PoSFeat)

This is the official implementation of PoSFeat (CVPR2022), a weakly supervised local feature training framework.

Decoupling Makes Weakly Supervised Local Feature Better
Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, Yulan Guo*
[Paper] [Arxiv] [Blog] [Bilibili] [Youtube]


We decouple the training of the description net from the training of the detection net, and postpone the detection net training. This simple but effective framework allows us to detect robust keypoints based on the optimized descriptors.


(1) Download training data

Download the preprocessed subset of MegaDepth from CAPS. If you want to test the local feature on IMC, please manually remove the banned scenes (0008 0021 0024 0063 1589).
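The banned scenes can also be removed with a short script. The sketch below assumes that each scene is stored as a sub-folder named after its scene ID directly under the dataset root; this layout and the helper name `remove_banned_scenes` are assumptions, so check your copy of the data and adjust the path logic before deleting anything.

```python
import shutil
from pathlib import Path

# Scene IDs listed above as banned for IMC.
BANNED_SCENES = ("0008", "0021", "0024", "0063", "1589")


def remove_banned_scenes(data_root, banned=BANNED_SCENES, dry_run=True):
    """Return the banned scene folders found under data_root.

    The folders are deleted only when dry_run is False.
    Assumes (hypothetically) one sub-folder per scene, named by scene ID.
    """
    root = Path(data_root)
    found = [root / scene for scene in banned if (root / scene).is_dir()]
    if not dry_run:
        for scene_dir in found:
            shutil.rmtree(scene_dir)
    return found
```

By default the function only reports what it would delete; call it again with `dry_run=False` once the reported list looks right.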

(2) Train the description net



To start the description net training, please manually modify the data_path of data_config_train in configs/train_desc.yaml.

For unknown reasons, multi-GPU training is very slow, so make only a single GPU available.
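One common way to expose a single GPU to a CUDA program is the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below only builds the command and environment for a training run; the helper name `single_gpu_launch` and the choice of GPU 0 are illustrative assumptions, not part of this repo.

```python
import os
import sys


def single_gpu_launch(config_path, gpu_id=0, base_env=None):
    """Build (command, environment) for a training run that sees only one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA only enumerates the devices listed here, so the process sees one GPU.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    cmd = [sys.executable, "train.py", "--config", config_path]
    return cmd, env
```

The returned pair can be passed to `subprocess.run(cmd, env=env, check=True)` from the repo root. Setting `CUDA_VISIBLE_DEVICES=0` in the shell before calling `python train.py ...` has the same effect.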



Then run the following command

python train.py --config ./configs/train_desc.yaml


It takes about 24 hours to finish description net training on a single NVIDIA RTX3090 GPU.

(3) Train the detection net


Similarly, modify the data path in configs/train_kp.yaml and make only a single GPU available.



And run the command

python train.py --config ./configs/train_kp.yaml

(4) The difference between the results trained with this code repo and in the paper


In the paper, we trained the model with the SGD optimizer (lr=1e-3), while this repo uses Adam with lr=1e-4. Note that Adam with lr=1e-3 may fail to converge.

(5) Multi-GPU training

In this repo, we use the DistributedDataParallel API of PyTorch for single-machine multi-GPU training, but for unknown reasons it is very slow, so multi-GPU training is disabled. If you really need multi-GPU training, you may have to modify the code yourself to use the DataParallel API.

(6) Visualization during training


We also provide a visualization tool to give an intuition about the model performance during training. The results are saved in the checkpoint path and include the score map (heatmap) of keypoints (meaningless during description net training), the keypoints (SIFT keypoints during description net training), and the raw matches (each match line is colored according to the epipolar constraint).
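As an illustration of what coloring a match by the epipolar constraint can mean, the sketch below computes the distance from a matched point to the epipolar line induced by its partner and maps that distance to a color. The fundamental matrix `F`, the 2-pixel threshold, and the green/red scheme are assumptions for illustration; the visualization code in this repo may use a different error measure or colormap.

```python
import math


def epipolar_distance(F, pt1, pt2):
    """Distance in pixels from pt2 to the epipolar line F @ [x1, y1, 1]."""
    x1 = (pt1[0], pt1[1], 1.0)
    # Epipolar line in image 2, written as a*x + b*y + c = 0.
    a, b, c = (sum(F[r][k] * x1[k] for k in range(3)) for r in range(3))
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * pt2[0] + b * pt2[1] + c) / norm


def match_color(F, pt1, pt2, threshold=2.0):
    """Green (RGB) if the match agrees with the epipolar geometry, else red."""
    if epipolar_distance(F, pt1, pt2) <= threshold:
        return (0, 255, 0)
    return (255, 0, 0)
```

Here `F` is a 3x3 nested list (or any indexable 3x3 structure) that maps a point in the first image to a line in the second image.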

(7) Some dependencies

We depend on the path package to manage paths in this repo. Because many packages have similar names, we list it separately: please follow its README on GitHub or its introduction on PyPI to install it. The other dependencies are common and can simply be installed with pip or conda.


(1) Feature extraction

Use extract.py to extract PoSFeat features. This script relies on managers/extractor.py, and users should provide a .yaml config file containing the datapath and the detector config. The output features can be saved in either .npz or .h5 format.

With use_sift: True in the config file, the output uses SIFT keypoints instead of the learned keypoints, together with PoSFeat descriptors. The SIFT keypoints are detected with the OpenCV default settings inside the dataloader and are passed directly in the inputs dictionary.
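To check what an extracted feature file contains, the arrays stored in a .npz output can be listed with NumPy. The sketch below makes no assumption about the array names that extract.py writes; the helper name `describe_npz` is hypothetical.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in a .npz file."""
    with np.load(path) as data:
        return {name: (data[name].shape, str(data[name].dtype)) for name in data.files}
```

Calling the function on one of the extracted files shows the stored array names together with their shapes and dtypes, which is a quick sanity check before running an evaluation.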

(2) HPatches


We follow the evaluation protocol proposed by D2-Net (please follow the instructions in D2-Net to download and modify the dataset), and modify the input code for convenience. The result will be saved in evaluations/hpatches/cache as a .npy file, and we provide the results of several methods in the cache folder. Note that you should manually remove the high-resolution scenes from the original dataset.

Run the command

python extract.py --config ./configs/extract_hpatches.yaml


Then go to the evaluations/hpatches folder, modify the path in the evaluation script (if you do not modify the script, there is only the PoSFeat_CVPR cached result) and run the script

cd ./evaluations/hpatches
python evaluation.py


When the evaluation finishes, you will get plots of the curves and a .txt file containing the quantitative results in the evaluations/hpatches folder.

(3) Aachen-Day-Night

We follow the standard Local feature challenge pipeline of The Visual Localization Benchmark. Please follow its instructions to download the dataset, then organize the data as follows

├── 3D-models/
│  ├── aachen_v_1/
│  │  ├── aachen_cvpr2018_db.nvm
│  │  └── database_intrinsics.txt
│  └── aachen_v_1_1/
│     ├── aachen_v_1_1.nvm
│     ├── cameras.bin
│     ├── database_intrinsics_v1_1.txt
│     ├── images.bin
│     ├── points3D.bin
│     └── project.ini
├── images # the v1 data and v1.1 data are mixed in this folder
│  └── images_upright/
│     ├── db/
│     ├── queries/
│     └── sequences/
├── queries/
│  ├── day_time_queries_with_intrinsics.txt
│  ├── night_time_queries_with_intrinsics.txt
│  └── night_time_queries_with_intrinsics_v1_1.txt
└── others/
   ├── database.db
   ├── database_v1_1.db
   ├── image_pairs_to_match.txt
   └── image_pairs_to_match_v1_1.txt

If you do not want to organize the data this way, you should manually modify the data path settings in evaluations/aachen/reconstruct_pipeline.py (Line 329-339) and evaluations/aachen/reconstruct_pipeline_v1_1.py (Line 319-330).
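Before running the pipeline, it can help to check that the dataset root matches the directory tree above. The following is a minimal sketch; the relative paths are copied from that tree, while the helper name `missing_aachen_paths` is hypothetical and not part of this repo.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_AACHEN_PATHS = (
    "3D-models/aachen_v_1/aachen_cvpr2018_db.nvm",
    "3D-models/aachen_v_1/database_intrinsics.txt",
    "3D-models/aachen_v_1_1/aachen_v_1_1.nvm",
    "3D-models/aachen_v_1_1/cameras.bin",
    "3D-models/aachen_v_1_1/database_intrinsics_v1_1.txt",
    "3D-models/aachen_v_1_1/images.bin",
    "3D-models/aachen_v_1_1/points3D.bin",
    "3D-models/aachen_v_1_1/project.ini",
    "images/images_upright/db",
    "images/images_upright/queries",
    "images/images_upright/sequences",
    "queries/day_time_queries_with_intrinsics.txt",
    "queries/night_time_queries_with_intrinsics.txt",
    "queries/night_time_queries_with_intrinsics_v1_1.txt",
    "others/database.db",
    "others/database_v1_1.db",
    "others/image_pairs_to_match.txt",
    "others/image_pairs_to_match_v1_1.txt",
)


def missing_aachen_paths(dataset_root, expected=EXPECTED_AACHEN_PATHS):
    """Return the expected files and folders that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

An empty list means every file and folder from the tree was found. If you only evaluate on one of the two versions, pass a shorter `expected` tuple.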


Before evaluation, extract the features first

python extract.py --config ./configs/extract_aachen.yaml


For evaluation on aachen-v1, run the command

cd ./evaluations/aachen
python reconstruct_pipeline.py --dataset_path [YOUR_data_path_root_aachen] \
--feature_path ../../ckpts/aachen/PoSFeat_mytrain/desc \
--colmap_path [YOUR_PATH_TO_COLMAP] \
--method_name PoSFeat_mytrain \
--match_list_path image_pairs_to_match.txt


For evaluation on aachen-v1.1, run the command

cd ./evaluations/aachen
python reconstruct_pipeline_v1_1.py --dataset_path [YOUR_data_path_root_aachen] \
--feature_path ../../ckpts/aachen/PoSFeat_mytrain/desc \
--colmap_path [YOUR_PATH_TO_COLMAP] \
--method_name PoSFeat_mytrain \
--match_list_path image_pairs_to_match_v1_1.txt
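The two commands differ only in the script name and the match list file. The sketch below builds either command as an argument list so it can be launched from a script; the helper name `aachen_eval_command` is hypothetical, and deriving the feature path from the method name is an assumption based on the example paths above.

```python
def aachen_eval_command(dataset_path, colmap_path, method_name="PoSFeat_mytrain", v1_1=False):
    """Build the argument list for the aachen-v1 or aachen-v1.1 evaluation."""
    # The v1.1 script and match list carry a "_v1_1" suffix; v1 has none.
    suffix = "_v1_1" if v1_1 else ""
    return [
        "python", f"reconstruct_pipeline{suffix}.py",
        "--dataset_path", dataset_path,
        # Assumed pattern: features live under ckpts/aachen/<method_name>/desc.
        "--feature_path", f"../../ckpts/aachen/{method_name}/desc",
        "--colmap_path", colmap_path,
        "--method_name", method_name,
        "--match_list_path", f"image_pairs_to_match{suffix}.txt",
    ]
```

The list can be passed to `subprocess.run(..., cwd="./evaluations/aachen", check=True)` so that the relative feature path resolves as in the shell commands.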


After evaluation, two more folders will be created: intermedia contains intermediate results (such as the sparse model and database), and results contains the .txt files that can be uploaded to the benchmark.

Note that because the pose estimation (image registration) is based on the reconstruction results, the results may differ from run to run.

(4) ETH local feature benchmark

Download the dataset following the instructions in ETH local feature benchmark (download instruction). Organize the dataset as follows

├── Alamo/
│  ├── images/
│  │  └── ...
│  └── database.db
├── ArtsQuad_dataset/
│  ├── images/
│  │  └── ...
│  └── database.db
├── Fountain/
│  ├── images/
│  │  └── ...
│  └── database.db
└── ...
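Since each scene is processed individually, it can be useful to list the scene folders that match the layout above. The sketch below returns every sub-folder that contains both an images/ folder and a database.db file; the helper name `list_eth_scenes` is hypothetical.

```python
from pathlib import Path


def list_eth_scenes(dataset_root):
    """Return the names of scene folders that contain images/ and database.db."""
    root = Path(dataset_root)
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and (d / "images").is_dir() and (d / "database.db").is_file()
    )
```

Folders that are missing either item are skipped, which makes incomplete downloads easy to spot.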


Extract the features first. We extract features for each scene individually (manually modify the subfolder)

python extract.py --config ./configs/extract_ETH.yaml


Then run evaluation for the scene

cd ./evaluations/ETH_local_feature
python reconstruction_pipeline.py --config ../../configs/extract_ETH.yaml


If you use this code in your project, please cite the following paper

    title={Decoupling Makes Weakly Supervised Local Feature Better},
    author={Li, Kunhong and Wang, Longguang and Liu, Li and Ran, Qing and Xu, Kai and Guo, Yulan},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {15838-15848}