[Feature Request] Add XFeat Learned Keypoint Detector & Descriptor
XFeat: Accelerated Features for Lightweight Image Matching. TL;DR: a very fast learned keypoint detector and descriptor that supports sparse and semi-dense matching.
Deep-learning-based feature detectors are usually more robust, but they are heavy on CPU. I got 1 FPS with the SuperPoint demo but 5 FPS with the XFeat demo (not much, I know, but still) on an Intel Core i7 8th-gen laptop. It's a nice alternative to SuperPoint and I'm curious how well it would integrate into RTAB-Map.
Support for the LightGlue learned matcher would be nice too.
With the right python interface, they could probably be used in rtabmap without changing any c++ code. See examples: https://github.com/introlab/rtabmap/tree/master/corelib/src/python
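For context, the python interface is small: the script only has to expose `init()` and `detect()`. Below is a minimal sketch of that contract, based on the rtabmap_superpoint.py example; the Nx3 keypoint / NxDim descriptor shapes come from that example, while the placeholder detector body and the 64-dim descriptor size are illustrative, not real XFeat code:

```python
import numpy as np

def init(cuda):
    # Called once by rtabmap before the first frame;
    # 'cuda' indicates whether a GPU should be used.
    # A real wrapper would load the XFeat network here.
    pass

def detect(imageBuffer):
    # rtabmap passes an image buffer and expects back:
    #   keypoints:   float32, Nx3 (x, y, score, per the SuperPoint example)
    #   descriptors: float32, NxDim (Dim=64 here is just a placeholder)
    kpts = np.zeros((0, 3), dtype=np.float32)
    desc = np.zeros((0, 64), dtype=np.float32)
    return kpts, desc
```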
Related posts:
- https://github.com/introlab/rtabmap/issues/1129
- https://github.com/introlab/rtabmap/issues/1123
- https://github.com/introlab/rtabmap/issues/1148
Hello my friend, I ran RTAB-Map's SuperPoint example in WSL2 Ubuntu 22.04 but it failed. I dropped the example script in the root folder of the SuperPoint git repo and set --Vis/FeatureType 15 --PyDetector/Path "path/to/SuperPointPretrainedNetwork/rtabmap_superpoint.py" --PyDetector/Model "path/to/SuperPointPretrainedNetwork/superpoint_v1.pth". I built and ran RTAB-Map with:
git clone https://github.com/introlab/rtabmap.git
cd rtabmap
git checkout tags/0.21.10-humble
mkdir -p build
cd build
conda activate xfeatpy311
cmake -DWITH_PYTHON=ON \
-DPython3_EXECUTABLE="$CONDA_PREFIX/bin/python" \
-DPYTHON_LIBRARY="$CONDA_PREFIX/lib/libpython3.11.so.1.0" \
-Dpybind11_DIR="$CONDA_PREFIX/lib/python3.11/site-packages/pybind11/share/cmake/pybind11" \
..
make -j4
sudo make install
export LIBGL_ALWAYS_SOFTWARE=1
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"
sudo ldconfig
rtabmap
Then I ran the SuperPoint example and found that it did not switch Feature2D to PyDetector; it stayed GFTT+ORB, and no "SuperPoint python init()" appeared.
If I set --Kp/DetectorStrategy 15, I can see:
SuperPoint python init()
Segmentation fault (core dumped)
Nearly every torch.nn construction, e.g.
test = nn.Sequential(nn.AvgPool2d(4, stride=4), nn.Conv2d(1, 24, 1, stride=1, padding=0))
would cause the crash.
I can run the python script normally in PyCharm, VS Code, and under gdb with $CONDA_PREFIX/bin/python.
It seems that the PyDetector C/C++ side cannot drive the python script correctly.
For additional context, here is my environment:
# packages in environment at /home/nan_cheng/miniconda3/envs/xfeatpy311:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
bzip2 1.0.8 h5eee18b_6
ca-certificates 2025.2.25 h06a4308_0
filelock 3.13.1 pypi_0 pypi
fsspec 2024.6.1 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
kornia 0.8.0 pypi_0 pypi
kornia-rs 0.1.8 pypi_0 pypi
ld_impl_linux-64 2.40 h12ee557_0
libffi 3.4.4 h6a678d5_1
libgcc 14.2.0 h767d61c_2 conda-forge
libgcc-ng 14.2.0 h69a702a_2 conda-forge
libgomp 14.2.0 h767d61c_2 conda-forge
libstdcxx 14.2.0 h8f9b012_2 conda-forge
libstdcxx-ng 14.2.0 h4852527_2 conda-forge
libuuid 1.41.5 h5eee18b_0
markupsafe 2.1.5 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.3 pypi_0 pypi
numpy 2.1.2 pypi_0 pypi
nvidia-cublas-cu12 12.6.4.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.6.80 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.6.77 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.6.77 pypi_0 pypi
nvidia-cudnn-cu12 9.5.1.17 pypi_0 pypi
nvidia-cufft-cu12 11.3.0.4 pypi_0 pypi
nvidia-curand-cu12 10.3.7.77 pypi_0 pypi
nvidia-cusolver-cu12 11.7.1.2 pypi_0 pypi
nvidia-cusparse-cu12 12.5.4.2 pypi_0 pypi
nvidia-cusparselt-cu12 0.6.3 pypi_0 pypi
nvidia-nccl-cu12 2.21.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.6.85 pypi_0 pypi
nvidia-nvtx-cu12 12.6.77 pypi_0 pypi
opencv-python 4.11.0.86 pypi_0 pypi
openssl 3.4.1 h7b32b05_0 conda-forge
packaging 25.0 pypi_0 pypi
pillow 11.0.0 pypi_0 pypi
pip 25.0 py311h06a4308_0
pybind11 2.13.6 pypi_0 pypi
python 3.11.11 he870216_0
readline 8.2 h5eee18b_0
setuptools 75.8.0 py311h06a4308_0
sqlite 3.45.3 h5eee18b_0
sympy 1.13.1 pypi_0 pypi
tk 8.6.14 h39e8969_0
torch 2.6.0+cu126 pypi_0 pypi
torchaudio 2.6.0+cu126 pypi_0 pypi
torchvision 0.21.0+cu126 pypi_0 pypi
triton 3.2.0 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2025a h04d1e81_0
wheel 0.45.1 py311h06a4308_0
xz 5.6.4 h5eee18b_1
zlib 1.2.13 h5eee18b_1
torch.cuda:True
torch.version:2.6.0+cu126
torch.version.cuda:12.6
torch.cudnn:90501
pybind11.get_include():/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python3.11/site-packages/pybind11/include
I would appreciate it if you could help me solve this problem!
I updated the comment in the file with --Kp/DetectorStrategy 15.
Could it be because the model path is relative here? https://github.com/introlab/rtabmap/blob/f03139d4ffd88d7c0d61977ff07ab129a7242c4d/corelib/src/python/rtabmap_superpoint.py#L32
On my computer, I just realized that PyDetector/Model doesn't exist, and in my local file the absolute path of the .pth file is hard-coded in the python script:
superpoint = SuperPointFrontend(weights_path="/home/mathieu/workspace/SuperPointPretrainedNetwork/superpoint_v1.pth",
                                nms_dist=4,
                                conf_thresh=0.015,
                                nn_thresh=1,
                                cuda=cuda)
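If a relative weights path is indeed the problem, one workaround (just a sketch, not what the shipped script does) is to resolve the .pth file relative to the script itself rather than the working directory of the embedding process:

```python
import os

# Resolve superpoint_v1.pth next to this script instead of relying on
# the current working directory of the process embedding python.
# Assumes __file__ is set when the script is loaded.
weights_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                            "superpoint_v1.pth")
```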
Another test you can do is to call the python script directly to make sure it works:
$ cd ~
$ python3 path/to/SuperPointPretrainedNetwork/rtabmap_superpoint.py
You can also uncomment those lines to see in which environment the script is running (when called from c++). https://github.com/introlab/rtabmap/blob/f03139d4ffd88d7c0d61977ff07ab129a7242c4d/corelib/src/python/rtabmap_superpoint.py#L12-L15
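Those commented-out lines are just environment debug prints, roughly:

```python
import os
import sys
print(os.sys.path)  # module search path seen by the embedded interpreter
print(sys.version)  # interpreter version actually running the script
```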
Example of expected output:
$ rtabmap-reprocess \
--Mem/UseOdomFeatures false \
--Kp/DetectorStrategy 15 \
--PyDetector/Path "~/workspace/SuperPointPretrainedNetwork/rtabmap_superpoint.py" \
rtabmap.db output.db
Reprocessing data of "rtabmap.db"...
High variance detected, triggering a new map...
SuperPoint python init()
SuperPoint python detect()
Processed 1/397 nodes [id=1 map=0 opt_graph=1]... 1071ms
SuperPoint python detect()
Processed 2/397 nodes [id=2 map=0 opt_graph=1]... 29ms
SuperPoint python detect()
Processed 3/397 nodes [id=3 map=0 opt_graph=2]... 26ms
SuperPoint python detect()
Processed 4/397 nodes [id=4 map=0 opt_graph=3]... 24ms
[...]
First call to SuperPoint is longer because it has to initialize the network.
You may have misunderstood me. My issue is not the path to the model weights, but the segmentation fault that happens when a torch module is initialized while the python script is called from C/C++. This is my code:
import torch
import numpy as np
from xfeat_model.xfeat import XFeat
import sys
import os
import torch.nn as nn
import pybind11

torch.set_grad_enabled(False)

xfeat_net = None
feature_record = {}

def init(cuda):
    print(os.sys.path)
    print(sys.version)
    print("xfeat python init()")
    print(f"torch.cuda:{torch.cuda.is_available()}")
    print(f"torch.version:{torch.__version__}")
    print(f"torch.version.cuda:{torch.version.cuda}")
    print(f"torch.cudnn:{torch.backends.cudnn.version()}")
    print(f"pybind11.get_include():{pybind11.get_include()}")
    print("test...")
    # minimal torch.nn construction; this is where the embedded run crashes
    test = nn.Sequential(nn.AvgPool2d(4, stride=4), nn.Conv2d(1, 24, 1, stride=1, padding=0))
    print("test success")
    # This class runs the xfeat network and processes its outputs.
    global xfeat_net
    xfeat_net = XFeat(cuda)

def detect(imageBuffer):
    print("xfeat python detect()")
    image = np.asarray(imageBuffer)
    image = image.astype('float32') / 255.
    global xfeat_net
    pts, desc = xfeat_net.detectAndCompute(image)
    # return float: Kpts:Nx3, Desc:NxDim
    # use copy to make sure memory is correctly re-ordered
    pts = np.float32(pts.cpu()).copy()
    desc = np.float32(desc.cpu()).copy()
    return pts, desc

if __name__ == '__main__':
    init(True)
    pts, desc = detect(np.random.rand(640, 480) * 255)
    print(pts.shape)
    print(desc.shape)
I can run it fine from Anaconda, but when called from C/C++ it fails with:
['', '/opt/ros/humble/lib/python3.10/site-packages', '/opt/ros/humble/local/lib/python3.10/dist-packages', '/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python311.zip', '/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python3.11', '/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python3.11/lib-dynload', '/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python3.11/site-packages', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat', '/home/nan_cheng/Desktop/rtabmap/xfeat']
3.11.11 (main, Dec 11 2024, 16:42:57) [GCC 11.2.0]
xfeat python init()
......
[ WARN] (2025-06-06 19:05:12.196) util3d.cpp:488::cloudFromDepthRGB() Decimation (4) is not valid for current image size (depth=240x135). Highest compatible decimation used=3.
[ WARN] (2025-06-06 19:05:12.224) SensorCaptureThread.cpp:391::mainLoop() Could not capture image!
[ WARN] (2025-06-06 19:05:12.224) SensorCaptureThread.cpp:542::mainLoop() no more data...
[ WARN] (2025-06-06 19:05:12.231) util3d.cpp:488::cloudFromDepthRGB() Decimation (4) is not valid for current image size (depth=240x135). Highest compatible decimation used=3.
torch.cuda:True
torch.version:2.6.0+cu126
torch.version.cuda:12.6
torch.cudnn:90501
pybind11.get_include():/home/nan_cheng/miniconda3/envs/xfeatpy311/lib/python3.11/site-packages/pybind11/include
test...
Segmentation fault (core dumped)
We had issues similar to this in the past, but they were fixed by using pybind11. I never tried to build and test rtabmap against a virtual python environment with different versions of the python libraries than the system ones. I see that /opt/ros/humble/local/lib/python3.10/dist-packages is in your python path; I don't know how much that could make things fail with your python3.11 libraries.
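If mixed interpreter paths are the culprit, a quick check you could add at the top of init() (a diagnostic sketch, not an rtabmap feature) would be:

```python
import sys

# Flag sys.path entries that belong to a different interpreter version;
# a python3.10 entry (e.g. the ROS dist-packages) can pull ABI-incompatible
# extension modules such as torch's C++ bindings into a python3.11 process.
ver = "python%d.%d" % sys.version_info[:2]
foreign = [p for p in sys.path if "python3." in p and ver not in p]
if foreign:
    print("WARNING: foreign interpreter paths on sys.path:", foreign)
    # optionally drop them before importing torch:
    # sys.path = [p for p in sys.path if p not in foreign]
```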