RTabMap Not Getting Loop Closures when used with a Moving Camera on Moving Robot
Hello,
(Posting this here instead of the forum as I saw your post about it getting spammed recently and that this was a reasonable alternative.) I'm using RTAB-Map in a rather odd way and have been struggling with one issue for a while that I was hoping you might be able to shed some light on.
I have a Zed 2 stereo camera mounted on a pan-tilt unit on a mobile robot. The pan-tilt unit is made up of two servos which allow me to control the camera's pan and tilt relative to the robot. (Basically, the camera can "look around".) For my experiments, I first map out my area and save the map. I then re-use this map for each trial with RTAB-Map in localization mode. I'm using ROS Melodic (I know it's very old) with RTAB-Map 0.20.23 (installed from APT, not custom built). When I create my map I start in a position and have the camera scan the area (both pan and tilt). I then move forward by about a metre and scan again. I repeat this until my entire area has been mapped. When I then try to run a trial, RTAB-Map seems unable to make any loop closures even though I can see on its UI that the closest image from the database is essentially the same as the current camera view.
Important Notes:
- I have a ground truth location system set up which I use to provide odometry while mapping. During trials odometry comes from wheel encoders. I have tried also using wheel odometry for mapping (ie. Removing the ground truth system from the equation) and the behaviour still persists. I only mention it here because the map databases I have linked at the end were made with this ground truth system odometry.
- When I start a trial I provide the initial position of the robot to RTAB-Map by publishing to
/rtabmap/initialposefor 2 seconds after RTAB-Map has started publishing /odom -> /map. If I start publishing before that, RTAB-Map fails to start. I have tried the system both with and without publishing the initial position. Both situations have the same behaviour described above, so initial position publish shouldn’t be causing the issue but I wanted to mention it anyway. - I have modified both the pan and tilt servos so that I can read their positions directly from the internal potentiometer and provide accurate position transforms for the stereo camera at all times.
- I have found that if I use a different method for mapping, then RTAB-Map is able to perform loop closures with no issues. I map out the area with the camera in one position, then move it to the next position and drive the path again, and repeat until all camera positions have been mapped. (This is all in one RTAB-Map session, so no multi-session mapping involved here.) Unfortunately, this would require any area to essentially be mapped many times and is much slower than the desired mapping method. It is thus not a viable mapping method moving forward and was just something I tried while debugging.
- I have the same system set up to run on a simulator (Webots) and have no issues. RTAB-Map is able to perform the expected loop closures using a map created of the simulated world using the desired mapping technique. Note that the simulation is run on a different (and more powerful) computer than the one on the robot.
- Through investigation I have also discovered that when using a map created using the desired mapping technique, if I start RTAB-Map when the robot is in the same position it was in when it started the map, then RTAB-Map will be able to make loop closures with no issues.
Example of What RTAB-Map Prints When it is Unable to Perform Loop Closures:
[ INFO] [1760661329.205858351]: rtabmap (515): Rate=1.00s, Limit=0.000s, Conversion=0.0017s, RTAB-Map=0.2557s, Maps update=0.0049s pub=0.0000s (local map=349, WM=349)
[ WARN] (2025-10-16 20:35:30.156) Rtabmap.cpp:3175::process() Rejecting localization (516 <-> 221) in this iteration because a wrong loop closure has been detected after graph optimization, resulting in a maximum graph error ratio of 38.035286 (edge 443->221, type=1, abs error=150.029529 deg, stddev=0.068844). The maximum error ratio parameter "RGBD/OptimizeMaxError" is 10.000000 of std deviation.
[ WARN] (2025-10-16 20:35:30.160) Rtabmap.cpp:4263::process() Republishing data of requested node(s) 122 326 (Rtabmap/MaxRepublished=2)
[ INFO] [1760661330.167064387]: rtabmap (516): Rate=1.00s, Limit=0.000s, Conversion=0.0019s, RTAB-Map=0.2322s, Maps update=0.0061s pub=0.0000s (local map=349, WM=349)
[ WARN] (2025-10-16 20:35:31.186) Rtabmap.cpp:3175::process() Rejecting localization (517 <-> 221) in this iteration because a wrong loop closure has been detected after graph optimization, resulting in a maximum graph error ratio of 32.143368 (edge 443->221, type=1, abs error=126.788963 deg, stddev=0.068844). The maximum error ratio parameter "RGBD/OptimizeMaxError" is 10.000000 of std deviation.
[ WARN] (2025-10-16 20:35:31.191) Rtabmap.cpp:4263::process() Republishing data of requested node(s) 121 327 (Rtabmap/MaxRepublished=2)
[ INFO] [1760661331.198506024]: rtabmap (517): Rate=1.00s, Limit=0.000s, Conversion=0.0007s, RTAB-Map=0.2619s, Maps update=0.0068s pub=0.0000s (local map=349, WM=349)
[ WARN] (2025-10-16 20:35:32.292) Rtabmap.cpp:3175::process() Rejecting localization (518 <-> 221) in this iteration because a wrong loop closure has been detected after graph optimization, resulting in a maximum graph error ratio of 37.279560 (edge 443->221, type=1, abs error=147.048589 deg, stddev=0.068844). The maximum error ratio parameter "RGBD/OptimizeMaxError" is 10.000000 of std deviation.
[ WARN] (2025-10-16 20:35:32.296) Rtabmap.cpp:4263::process() Republishing data of requested node(s) 120 328 (Rtabmap/MaxRepublished=2)
[ INFO] [1760661332.302206477]: rtabmap (518): Rate=1.00s, Limit=0.000s, Conversion=0.0025s, RTAB-Map=0.2078s, Maps update=0.0052s pub=0.0000s (local map=349, WM=349)
[ WARN] (2025-10-16 20:35:33.345) Rtabmap.cpp:3175::process() Rejecting localization (519 <-> 224) in this iteration because a wrong loop closure has been detected after graph optimization, resulting in a maximum graph error ratio of 43.477444 (edge 519->224, type=1, abs error=178.721734 deg, stddev=0.071745). The maximum error ratio parameter "RGBD/OptimizeMaxError" is 10.000000 of std deviation.
[ WARN] (2025-10-16 20:35:33.349) Rtabmap.cpp:4263::process() Republishing data of requested node(s) 119 329 (Rtabmap/MaxRepublished=2)
Screenshot of UI When Unable to Perform Loop Closures:
(Note that the images it is comparing are very similar)
Launch file: This launch file is used in conjunction with several others to start the system.
<launch>
<!--Based on rtabmap_ros rtabmap.launch and zed_ros_wrapper zed_no_tf.launch
and http://wiki.ros.org/rtabmap_ros/Tutorials/HandHeldMapping
and http://wiki.ros.org/rtabmap_ros/TutorialsOldInterface/SetupOnYourRobot -->
<arg name="open_viz" default="false"/>
<arg name="localize" default="true"/> <!-- Set to true to put RTAB-Map into localization mode. -->
<arg name="is_continued_map" default="false"/> <!--set to true with localize=false to expand on an existing map-->
<arg name="is_sim" default="false"/><!-- Set to true when running with simulator. -->
<arg name="filepath" default="rtabmap.db"/>
<arg if="$(eval localize == true or (localize == false and is_continued_map == true))" name="rtabmap_args" default=""/>
<arg if="$(eval localize == false and is_continued_map == false)" name="rtabmap_args" default="--delete_db_on_start"/>
<arg name="odom_topic" default="/mcu/wheel_odometry"/>
<group ns="rtabmap">
<node pkg="nodelet" type="nodelet" name="rgbd_sync" args="standalone rtabmap_ros/rgbd_sync" output="screen">
<remap from="rgb/image" to="/zed_node/rgb/image_rect_color"/>
<remap from="depth/image" to="/zed_node/depth/depth_registered"/>
<remap from="rgb/camera_info" to="/zed_node/rgb/camera_info"/>
<remap from="rgbd_image" to="rgbd_image"/> <!-- output -->
<!-- Should be true for not synchronized camera topics
(e.g., false for kinectv2, zed, realsense, true for xtion, kinect360)-->
<param name="approx_sync" value="false"/>
</node>
<node name="rtabmap" pkg="rtabmap_ros" type="rtabmap" output="screen" args="$(arg rtabmap_args)">
<param name="frame_id" type="string" value="base_cam"/>
<param name="subscribe_depth" type="bool" value="false"/>
<param name="subscribe_rgbd" type="bool" value="true"/>
<remap from="odom" to="$(arg odom_topic)"/>
<remap from="rgbd_image" to="rgbd_image"/>
<param name="queue_size" type="int" value="10"/>
<!-- <param name="wait_imu_to_init" type="bool" value="true"/>
<param name="imu_topic" type="string" value="/zed_node/imu/data"/> -->
<!-- RTAB-Map's parameters -->
<param name="RGBD/AngularUpdate" type="string" value="0.01"/>
<param name="RGBD/LinearUpdate" type="string" value="0.01"/>
<param name="RGBD/OptimizeFromGraphEnd" type="string" value="false"/>
<!--0 is infinity. this is the max range from the sensor-->
<param if="$(arg localize)" name="Grid/RangeMax" value="0.0"/>
<param name="Mem/STMSize" value="30"/>
<param if="$(eval localize == true and is_sim == true)" name="RGBD/OptimizeMaxError" type="double" value="1.0" />
<param if="$(eval localize == true and is_sim == false)" name="RGBD/OptimizeMaxError" type="double" value="10.0" />
<param unless="$(arg localize)" name="RGBD/OptimizeMaxError" type="double" value="0.1" />
<param name="Grid/MinClusterSize" type="int" value="20"/>
<!--Points above this height aren't included in the projection-->
<param unless="$(arg is_sim)" name="Grid/MaxObstacleHeight" type="double" value="1.0"/>
<param if="$(arg is_sim)" name="Grid/MaxObstacleHeight" type="double" value="2.4"/>
<!-- Cache more odom measurements to try to limit the large jumps from incorrect loop closures. -->
<param name="RGBD/MaxOdomCacheSize" type="string" value="40"/>
<!-- foot print doesn't work so we use this to keep from detecting the wheelchair. -->
<param name="Grid/RangeMin" type="double" value="0.5"/>
<!--Points below this height are ignored-->
<param unless="$(arg is_sim)" name="Grid/MaxGroundHeight" type="double" value="0.1"/>
<param if="$(arg is_sim)" name="Grid/MaxGroundHeight" type="double" value="0.0"/>
<param name="database_path" type="string" value="$(arg filepath)"/>
<!--if localize is true, we want this false. Otherwise it should be true-->
<param name="Mem/IncrementalMemory" type="string" value="$(eval localize == false)"/>
<param unless="$(arg is_sim)" name="RGBD/SavedLocalizationIgnored" type="string" value="true"/>
</node>
<!-- Publishing the Position Covariance for Uncertainty Measure -->
<node name="rtabmap_pose_uncertainty_node" pkg="mapping_and_localization" type="rtabmap_pose_uncertainty_node.py">
</node>
<!-- Visualisation RTAB-Map -->
<node if="$(arg open_viz)" pkg="rtabmap_ros" type="rtabmapviz" name="rtabmapviz" args="-d ~/.ros/rtabmap_gui.ini" clear_params="true" output="screen">
<param name="subscribe_depth" type="bool" value="true"/>
<param name="subscribe_rgb" type="bool" value="false"/>
<param name="subscribe_rgbd" type="bool" value="true"/>
<param name="subscribe_stereo" type="bool" value="false"/>
<param name="subscribe_scan" type="bool" value="false"/>
<param name="subscribe_scan_cloud" type="bool" value="false"/>
<param name="subscribe_scan_descriptor" type="bool" value="false"/>
<!-- <param name="subscribe_odom_info" type="bool" value="true"/> -->
<param name="frame_id" type="string" value="base_cam"/>
<param name="odom_frame_id" type="string" value="odom"/>
<param name="wait_for_transform_duration" type="double" value="0.2"/>
<param name="queue_size" type="int" value="10"/>
<param name="approx_sync" type="bool" value="false"/>
<remap from="rgb/image" to="/zed_node/rgb/image_rect_color"/>
<remap from="depth/image" to="/zed_node/depth/depth_registered"/>
<remap from="rgb/camera_info" to="/zed_node/rgb/camera_info"/>
<remap from="rgbd_image" to="rgbd_image"/>
<remap from="odom" to="/mcu/wheel_odometry"/>
</node>
</group>
</launch>
Database files:
Thank you!
The loop closures are rejected because there is an abs error=178.721734 deg. That could have been caused by this issue: https://github.com/introlab/rtabmap/pull/1511
You should be able to build latest rtabmap/rtabmap_ros on ubuntu 18.04/melodic. Use master branch on both repos. For rtabmap, you will need to install a more recent cmake version (e.g., 3.31.9). For your launch file, the package name of these nodes have changed, see this table for the mapping between melodic versus latest version.
Some comments:
-
Instead of using the camera frame as base frame, you could use the base frame of the robot instead. If the camera is on a pan/tilt, you could publish on TF the transform between the base of the pan/tilt to the camera link (based on the servo values). With
RGBD/LinearUpdate=0and theframe_idset to base frame of your robot (e.g.,base_linkorbase_footprint), rtabmap would know the position of the camera relative to the base frame using TF. That should also make it easier to use wheel odometry if it gives already the pose of the base frame of the robot. -
How fast is the pan/tilt rotating? I'll suggest to add an intermediate node between the camera node and rtabmap node to republish the images only when the pan/tilt is not rotating to avoid blurry images or create very strange effects like this:
-
I looked at the "ground truth" database, but some consecutive images look significantly off. For example, these two images were taken just 2 seconds apart:
Thank you for replying so quickly!
I tried installing the newest version of rtabmap and rtabmap_ros from the master branch for both repos as you suggested. After un-installing the existing apt versions. The version is now reported as 0.23.2. It doesn't seem to have solved the issue. I also tried changing frame_id to my robot's base frame (which is base_link, btw) and setting RGBD/LinearUpdate=0. I already publish the camera transforms over the TF topic based on the servo positions. This also didn't seem to affect it. Finally, I did as you suggested and added an intermediate node between the zed publishing node and rtabmap which prevents images (both rgb and depth) and point clouds from being published while the camera is moving. In the resulting map, I only saw two images with blur caused by camera movement (checked using rtabmap-databaseViewer), however, the problem persists. (It's worth noting that there is still some blur caused by robot motion itself.)
Here is a recorded database with all three of your suggestions implemented together.
While playing around with this today, I noticed that I don't actually need to start at the exact same position as I was when I started mapping. I just need to be facing the same direction when I start RTAB-Map in localisation mode as I was when I started mapping for it to work. I guess this tracks with the abs error=178.721734 deg you noticed. In the error printout I originally gave you, I had started the wheelchair when I was pointing in nearly the exact opposite direction as I was when I started making the map.
EDIT: I've done a bit more playing with it and determined that this holds true for any starting place in the map. It also is more accepting of angle variations that I initially though. It will tolerate starting positions rotated 90 degrees clockwise (looking down on the robot from above) up to positions rotated about 135 degrees counter-clockwise (ie. the opposite direction). As long as the robot starts within that range, everything works fine.
To answer your question: How fast is the pan/tilt? I've found empirically that the speed is 209 deg/s on average but it can range from 168 deg/s to 245 deg/s. So fast, but it varies a decent amount.
It's odd/concerning that you found some consecutive images so badly off like that. I need to look into that still so I don't have an answer as to why that is happening just yet. May I ask what you used to find that? Somewhere in the databaseViewer? Just in the plain RTAB-Map UI? I admit I don't always know what I'm doing with both UIs so I might have missed it.
To compare consecutive point clouds, you can open the Constraints Viewer in rtabmap-databaseViewer, then scroll the 2 main sliders (in main layout) to show the clouds relative to them.
With the base frame set to base_link, it is easier to see from where the error could come from. Between these two pictures, the base frame didn't move, only the camera is looking down, maybe the pitch angle is wrong (coming from the pan/tilt TF). (I enabled the frustums with right-click on the view)
Looking at the new database, the RGBD/OptimizeMaxError parameter is set to 0.1, which is very low, so that good loop closures may be rejected because of that. I set it to between 2 and 4 in general.
So I opened the new map, changed RGBD/OptimizeMaxError to 3 and put it in localization mode. I've replayed the database of your previous post on it. It seems to detect fairly often loop closures:
Looking at the logs, I could not see large abs angle errors (>150 deg) for the rejected loop closures. They were rejected not so far from RGBD/OptimizeMaxError std ratio, as expected.
Ok, I was able to get the cloud viewing working in the Constraints viewer. (For some reason, I need to open the Graph View before it will load the point clouds in the Constraints Viewer. I'm not sure why. There's no errors or anything being printed out in the teminal that would give any indication as to the issue.) I can see what you mean about them not lining up. I'll investigate what's going on there.
I confirmed that if I replay the older database for the most recent database, I get lots of loop closures (when setting RGBD/OptimizeMaxError to 3). I expect this because both databases start with the robot pointing the same direction. I've found that localisation works with no issues as long as the robot starts pointed in the same direction as it was pointing when mapping started. The issues seem to arise when the robot starts while pointing the opposite (or nearly opposite) direction.
I've attached a database taken today where the robot starts recording when it is pointed in the opposite direction as it was pointing when the mapping started. I tried replaying it against the the previous database I uploaded (oct 17) and found that setting RGBD/OptimizeMaxError to 3 or even 10 doesn't really allow loop closures. I think I saw two when RGBD/OptimizeMaxError is 10 and none at all when it's 3. So this database should illustrate the issue I'm running into, I hope. I apologize for not uploading something like it sooner, I wasn't aware I could use at database as a source and replay it to test localisation. Note that this database was recorded during the day while the others have been recorded at night. I did test that if the wheelchair is pointed in the same direction as it was at the start of the nightime map that it was able to make loop closures. ie. It's not the daytime vs. nighttime difference that's causing the issue here.
Database for localisation tests.
While playing around with it I noticed that if I left RTAB-Map in mapping mode while replaying this database instead of flipping to localisation mode, it makes a lot more loop closures. (aka some instead of none) Is this expected behaviour?
I also have a question about RGBD/OptimizeMaxError. When I map out my area I've been setting it to 0.1. When I restart the system later for localisation, I set it to 10.0. The general idea I had being that I want to be more picky when mapping to keep the map from being distorted from real life too much. Based on your comment above, I take it you would not recommend this approach?
Update:
I've sorted out the issues with the pitch angle of the camera. ie. where the images weren't lining up when the tilt angles were different. The values reported by the servos are inaccurate enough to cause these issues. I've fixed this by grabbing the roll and pitch reported by the Zed's onboard IMU and published that in my TF messages where I was publishing tilt values. (Roll shouldn't change but I added anyway because I had it.) I'm unfortunately not able to also replace the pan values from the servos with the IMU's yaw value because that value also includes the robot's overall yaw change.
In addition to changing out the reported tilt values, I also redid the TF transforms published for the pan tilt unit. I had originally approximated it as one point transform where all the pan and tilt angles are, but this is not exactly correct. The construction of the unit means that the servo axes don't exactly align and one is on top of the other etc. So it's now constructed of multiple transforms but it properly mimics the position of all the servos now. (I mention this because you will notice that the base of the frustums with different pan and tilt values will not all be at the same point now. This is expected and how it should be.)
I've recorded new databases with the corrected tilt values. The issue still persists. If the robot starts trying to localize roughly 180 degrees rotated from where it started making the map, RTAB-Map is unable to make loop closures. If it starts facing the same direction it was when it started mapping, everything is fine.
Map database Database for localisation, ie. use this as a data source to see the localisation issue
Note that you will still need to change the value of RGBD/OptimizeMaxError for these databases. (I'm sorry. I forgot to change it before I recorded them.) I've tried both 3 and 10 and neither seemed to work.
Update 2:
Turns out if I change the optimizer from g2o to TORO, it works just fine and finds loop closures as I would expect. (ie. Optimizer/Strategy = "0" ) Do you have any idea why this might be?
I tried both 0 and 1 for g2o/Optimizer (ie. Levenberg and GaussNewton). For each of those, I also tried all the possible options (0, 1, 2, and 3, ie. csparse, pcg, cholmod, and Eigen) for g2o/Solver. None of these combinations seemed to work. I don't have RTAB-Map built with any of the other optimizers at the moment so I can't check them.
Thanks for the new databases, I can reproduce it here:
[ INFO] (2025-10-27 19:19:16) Rtabmap.cpp:3339::rtabmap::Rtabmap::process() Max optimization linear error = 0.074587 m (link 560->184, var=0.002241, ratio error/std=1.575762, thr=3.000000)
[ INFO] (2025-10-27 19:19:16) Rtabmap.cpp:3381::rtabmap::Rtabmap::process() Max optimization angular error = 167.934848 deg (link 562->184, var=0.005258, ratio error/std=40.420956, thr=3.000000)
[ WARN] (2025-10-27 19:19:16) Rtabmap.cpp:3398::rtabmap::Rtabmap::process() Rejecting localization (615 <-> 214) in this iteration because a wrong loop closure has been detected after graph optimization, resulting in a maximum graph error ratio of 40.420952 (edge 562->184, type=1, abs error=167.934848 deg, stddev=0.072512). The maximum error ratio parameter "RGBD/OptimizeMaxError" is 3.000000 of std deviation.
[ WARN] (2025-10-27 19:19:16) Rtabmap.cpp:3433::rtabmap::Rtabmap::process() Global and loop closures seem not tallying together, try again to optimize without local loop closures...
It works well with gtsam optimizer though, meaning there is maybe a bug in the g2o integration about this case. I would not recommend using TORO in general, g2o and gtsam should be better. I'll need more time to debug this.
I've found how to make it work with g2o, we have to increase RGBD/LocalizationPriorError to 0.1 (default is 0.001). That parameter has been added for some optimization issues with tags: https://github.com/introlab/rtabmap_ros/issues/1057. The real reason may be coming from the odometry covariance and/or bad local transform between base_link and the camera frame, but RGBD/LocalizationPriorError seems to be the current workaround to make g2o converge.
For your other questions above...
Apologize for the Constraints Viewer, to be able to compare nodes that are not linked by a loop closure or an odometry constraint, we should open the Graph View once to get optimized poses. That is indeed an hidden feature. Just created an issue about that: https://github.com/introlab/rtabmap/issues/1608
If more loop closures are added in mapping mode than in localization mode, it could be related on how the deformation check is validated. In mapping mode, the whole map can deform. In localization mode, only the current odometry path is allowed to deform, not the map. If the map's graph has some errors, re-localizations may be rejected. Note that increasing RGBD/LocalizationPriorError makes the map's graph less stiff and allows it to be more deformed.
You can be more picky in mapping mode by lowering RGBD/OptimizeMaxError, but 0.1 is too low. That parameter means how much error a link can deform, based on ratio of the std dev of the covariance. A value of 3 means: accept all deformations up to 3 times the std dev of the covariance. For example, if a link is 1 meter long and the variance is 0.0001 (std=1cm), a ratio of 3 would mean that we could accept that this link can be deformed up to 3 cm.
I've found how to make it work with g2o, we have to increase
RGBD/LocalizationPriorErrorto0.1(default is0.001). That parameter has been added for some optimization issues with tags: #1057. The real reason may be coming from the odometry covariance and/or bad local transform between base_link and the camera frame, butRGBD/LocalizationPriorErrorseems to be the current workaround to make g2o converge.
That doesn't seem to work for me when I run it on my computer with the Oct. 24th databases? When I change RGBD/LocalizationPriorError to 0.1 from 0.001, when the optimizer is g2o and RGBD/OptimizeMaxError is 3 or 10 (I tried both), it does not make loop closures.
Here's the exact steps I follow. I am I doing something different from you? (Note that I always "reset" my oct_24_zed_tilt.db file between tests by deleting it and replacing it with a copy of a clean backup I've kept, so I can be sure that nothing I've done previously has affected it.)
- Run rtabmap with the oct_24_zed_tilt.db file
- Select ok for both pop-ups that open on start up.
- Open Window > Preferences and switch to advanced.
- Go to Source, switch to database and set it to be oct_24_zed_tilt_loc.db (click yes for both pop ups).
- Go to RGB-D SLAM > Graph Optimization > change Optimize Max Error from 0.1 to 3.
- Go to RGB-D SLAM > change Localization Prior Error 0.001 to 0.1.
- Click Apply then Ok.
- Go to Detection and switch to Localisation from Mapping.
- Click blue start button.
It works well with gtsam optimizer though, meaning there is maybe a bug in the g2o integration about this case. I would not recommend using TORO in general, g2o and gtsam should be better. I'll need more time to debug this.
Ok, I will try installing gtsam in a moment and try it out. It's not too important to me which optimizer I use.
Apologize for the Constraints Viewer, to be able to compare nodes that are not linked by a loop closure or an odometry constraint, we should open the Graph View once to get optimized poses. That is indeed an hidden feature. Just created an issue about that: https://github.com/introlab/rtabmap/issues/1608
Hidden features are the best features! Thanks for confirming that it's not just me then. I was worried I was doing something wrong or had managed to somehow break something on my system.
Also, thank you for explaining RGBD/OptimizeMaxError a bit more in-depth. It makes sense with 0.1 is way too low. I'll increase it for mapping.
On the latest windows release, I could reproduce the error even with RGBD/LocalizationPriorError=0.1. However, I just tried again with a fresh download and exact steps on Ubuntu (on which I was debugging yesterday) and it localized on the first frame. That could be caused by a different version of g2o used. On windows release, g2o version is 2024.12.14 and on ubuntu I am using the g2o ros2 humble version 2020.5.29(ros-humble-libg2o).
On your system, which g2o version are you using? On ROS melodic, the g2o binaries would be 2018.3.25 (ros-melodic-libg2o). It seems that ROS2 humble/jazzy/kilted are all providing 2020.5.29 version, which could be why we didn't see that kind of error before.
I won't be able to check until Monday unfortunately, but I'm working on Ubuntu 18.04 with ROS Melodic installed using the instructions on the ROS website. I haven't made any modifications or done anything weird to the setup, so it should have the default g2o binaries for ROS Melodic.
I will check the exact version on Monday though.
Interesting that the issue disappeared for one version and has reappeared in a more recent version of the library. I'm not sure what to make of that.
Ok, so it looks like I do indeed have 2018.3.25 installed via ros-melodic-libg2o.
Output of dpkg -s ros-melodic-libg2o on my robot's computer: (if it helps)
Package: ros-melodic-libg2o
Status: install ok installed
Priority: extra
Section: misc
Installed-Size: 6299
Maintainer: Vincent Rabaud <[email protected]>
Architecture: amd64
Version: 2018.3.25-0bionic.20201015.025147
Depends: libc6 (>= 2.14), libcamd2 (>= 1:4.5.2), libcholmod3 (>= 1:4.5.2), libcxsparse3 (>= 1:4.5.2), libgcc1 (>= 1:3.0), libgl1, libglu1-mesa | libglu1, libstdc++6 (>= 5.2), libboost-all-dev, libeigen3-dev, libgl1-mesa-dev, libglu1-mesa-dev, libsuitesparse-dev, ros-melodic-catkin
Description: The libg2o library from http://openslam.org/g2o.html
Homepage: https://github.com/RainerKuemmerle/g2o
I refactored two things to work with old and new g2o for your specific case. Basically with that PR https://github.com/introlab/rtabmap/pull/1610, you will be able to set RGBD/LocalizationPriorError to 0. That will do the old behavior for localization optimization, which works with your data.
Still need to doublecheck a couple of things on that PR before merging to master, but you could probably use it now.
Sorry for the delay, I just tried the changes. I see they were merged into master so I pulled from master, rebuilt and reinstalled. (Following these instructions.) I double checked that the line in corelib/include/rtabmap/core/parameters.h matched what I saw changed in the merge request as a way to verify that I have the correct version of the code.
I re-ran the test with the Oct. 24th databases following the steps I listed earlier with the only difference being that I changed RGBD/LocalizationPriorError to 0. I only got one loop closure at the beginning of the database replaying and then no more after that. I tried with both 3 and 10 as RGBD/OptimizeMaxError just to cover my bases. Did you do something different when testing? Is this once again a g2o version thing?
Also, what do you mean by "the old behaviour for localization optimization"?
I reverted the changes allowing RGBD/LocalizationPriorError to be zero. After thinking about it, it doesn't make sense to remove the priors. The "old behavior" was about optimizing map's nodes with localization's nodes, however just removing the priors didn't put back the old behavior because we don't add back the map's links to the optimization. So setting to zero was doing an undefined behavior.
I tried on melodic, here are my findings. To find loop closures on most of the frames while keeping optimization verification, we have to do set these parameters:
Here a video of the expected results:
https://github.com/user-attachments/assets/7ed3abbb-587b-4db8-ba7a-6d768d01e058
It is easier to see the actual problem. Either the map's odometry or the localization's odometry is bad, that makes the localization graph jumping a lot.
To remove the optimization verification out of the way and just check the actual localization transforms to see if they look good, we can set:
Here the localization result:
https://github.com/user-attachments/assets/bf4741cc-2c24-4639-8c6d-7051d724dcbd
Again, the robot is jumping a lot. I'll suggest to compare localization performance when the pan/tilt is not moving and when it does. I suspect that the transform between base link and camera link is not always perfect, causing these huge jumps. In summary, the optimization verification was doing its job all the time, rejecting these big jumps.
I reverted the changes allowing RGBD/LocalizationPriorError to be zero. After thinking about it, it doesn't make sense to remove the priors.
Very understandable.
The good news is that I'm able to replicate the behaviour you found above. Thank you for finding that. I'll go back to my odometry and see what I can find that could be causing the jumping.
So I've been looking at my odometry trying to find the issue. I tried switching out the ground truth based odometry with the odometry from the wheel encoders. The ground truth based odometry just focuses on getting the best position at that particular time so it can jump around a bit. The wheel encoders are smooth but are prone to drift over time. (Pretty much what you would expect for wheel encoders.)
I also tried not moving the camera relative the wheelchair while mapping or localising. The camera is stuck forward and I hardcoded the pan and tilt angles to eliminate any potential issues from the servo readings. This should, in theory, eliminate any issues with bad transforms between base_link and the camera because they are now all static. The issue of not being able to localise when starting backwards from the map start persists.
Now, however, when I follow your steps above which showed the localisation jumping all over the map, it no longer seems to do this in this example. There's a couple jumps but they occur when the system is able to make a loop closure after being unable to for a bit. So they seem more like corrections for drift rather than actual odometry jumps, if I understand correctly.
I should also point out that I have a UI for my robot which makes use of Rviz to view the robot and map during operation. When I observe the transforms on the UI during mapping and localisation I don't notice anything odd so I'm a bit at a loss as to where the issue in my odometry could be. I'll keep investigating but I was wondering if you had any other insights based on these videos?
Databases:
Map database Database for localisation (Note: I had to switch to a ZED camera (ie. original ZED, no extra numbers or letters in the name) from my ZED 2 as it was having connection issues.)
Thanks for removing the pan tilt from the equation, that gives a better minimal example to reproduce the issue. I refactored in the commit above how the poses are provided to g2o. I think the main issue was that the map's poses were given in localization referential instead of the map referential, with priors set in map referential. When localization referential was 180 deg off the map referential, that produced the worst case scenario where the prior set to map's poses was 180 deg off on initialization of the optimizer, and g2o could not converge correctly to move back the map's poses accordingly to their prior. Another way to see the optimization not converging fast enough is by looking at the debug log where we see the chi2 stuck at a high value at the last iteration:
[ INFO] (2025-11-23 13:22:13.465) OptimizerG2O.cpp:1042::optimize() g2o optimizing begin (max iterations=20, robust=0)
[DEBUG] (2025-11-23 13:22:13.465) OptimizerG2O.cpp:1171::optimize() iteration 0: 57 nodes, 93 edges, chi2: 85676.867589
[DEBUG] (2025-11-23 13:22:13.465) OptimizerG2O.cpp:1171::optimize() iteration 1: 57 nodes, 93 edges, chi2: 74661.496830
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 2: 57 nodes, 93 edges, chi2: 63617.334901
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 3: 57 nodes, 93 edges, chi2: 49198.007150
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 4: 57 nodes, 93 edges, chi2: 36453.330453
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 5: 57 nodes, 93 edges, chi2: 22769.252748
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 6: 57 nodes, 93 edges, chi2: 22118.560976
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 7: 57 nodes, 93 edges, chi2: 7701.629163
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 8: 57 nodes, 93 edges, chi2: 6046.748927
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 9: 57 nodes, 93 edges, chi2: 1744.086124
[DEBUG] (2025-11-23 13:22:13.466) OptimizerG2O.cpp:1171::optimize() iteration 10: 57 nodes, 93 edges, chi2: 1525.181156
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 11: 57 nodes, 93 edges, chi2: 1515.201001
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 12: 57 nodes, 93 edges, chi2: 1509.799473
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 13: 57 nodes, 93 edges, chi2: 1505.373714
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 14: 57 nodes, 93 edges, chi2: 1501.027311
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 15: 57 nodes, 93 edges, chi2: 1495.426863
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 16: 57 nodes, 93 edges, chi2: 1484.422192
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 17: 57 nodes, 93 edges, chi2: 1444.599459
[DEBUG] (2025-11-23 13:22:13.467) OptimizerG2O.cpp:1171::optimize() iteration 18: 57 nodes, 93 edges, chi2: 1070.571830
[DEBUG] (2025-11-23 13:22:13.468) OptimizerG2O.cpp:1171::optimize() iteration 19: 57 nodes, 93 edges, chi2: 918.314349
[ INFO] (2025-11-23 13:22:13.468) OptimizerG2O.cpp:1220::optimize() g2o optimizing end (20 iterations done, error=918.314349, time = 0.002898 s)
The commit above helps to converge faster, so most of the localizations rejected are now accepted. However, for the case of the log above, we also need to increase the maximum iterations Optimizer/Iterations from 20 (default) to something higher, for example:
[ INFO] (2025-11-23 14:13:54.556) OptimizerG2O.cpp:1045::optimize() g2o optimizing begin (max iterations=40, robust=0)
[DEBUG] (2025-11-23 14:13:54.556) OptimizerG2O.cpp:1174::optimize() iteration 0: 58 nodes, 91 edges, chi2: 43743.988816
[DEBUG] (2025-11-23 14:13:54.556) OptimizerG2O.cpp:1174::optimize() iteration 1: 58 nodes, 91 edges, chi2: 29727.615443
[DEBUG] (2025-11-23 14:13:54.556) OptimizerG2O.cpp:1174::optimize() iteration 2: 58 nodes, 91 edges, chi2: 16038.753973
[DEBUG] (2025-11-23 14:13:54.556) OptimizerG2O.cpp:1174::optimize() iteration 3: 58 nodes, 91 edges, chi2: 11845.015041
[DEBUG] (2025-11-23 14:13:54.557) OptimizerG2O.cpp:1174::optimize() iteration 4: 58 nodes, 91 edges, chi2: 10699.779367
[DEBUG] (2025-11-23 14:13:54.557) OptimizerG2O.cpp:1174::optimize() iteration 5: 58 nodes, 91 edges, chi2: 8182.286772
[DEBUG] (2025-11-23 14:13:54.557) OptimizerG2O.cpp:1174::optimize() iteration 6: 58 nodes, 91 edges, chi2: 2183.989160
[DEBUG] (2025-11-23 14:13:54.557) OptimizerG2O.cpp:1174::optimize() iteration 7: 58 nodes, 91 edges, chi2: 1666.482508
[DEBUG] (2025-11-23 14:13:54.557) OptimizerG2O.cpp:1174::optimize() iteration 8: 58 nodes, 91 edges, chi2: 1049.340371
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 9: 58 nodes, 91 edges, chi2: 512.637711
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 10: 58 nodes, 91 edges, chi2: 388.086665
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 11: 58 nodes, 91 edges, chi2: 204.899353
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 12: 58 nodes, 91 edges, chi2: 69.319511
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 13: 58 nodes, 91 edges, chi2: 50.716559
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 14: 58 nodes, 91 edges, chi2: 19.734381
[DEBUG] (2025-11-23 14:13:54.558) OptimizerG2O.cpp:1174::optimize() iteration 15: 58 nodes, 91 edges, chi2: 10.509281
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 16: 58 nodes, 91 edges, chi2: 9.249874
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 17: 58 nodes, 91 edges, chi2: 8.902933
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 18: 58 nodes, 91 edges, chi2: 8.843961
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 19: 58 nodes, 91 edges, chi2: 8.841250
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 20: 58 nodes, 91 edges, chi2: 8.841183
[DEBUG] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1174::optimize() iteration 21: 58 nodes, 91 edges, chi2: 8.841181
[ INFO] (2025-11-23 14:13:54.559) OptimizerG2O.cpp:1197::optimize() Stop optimizing, not enough improvement (0.000002 < 0.000010)
In this case, it converged after 21 iterations. Looking at the optimizer iterations statistics, we can see how many iterations were required for each localization.
Here the results of running the loc database on the map database:
https://github.com/user-attachments/assets/bcb2bcc5-ef59-4456-ad13-50d3c47eae7a
Good news! I tried the most recent version on my robot (both without and with camera pan and tilt) and it works! I tested without increasing Optimizer/Iterations from 20 initially and it seemed to work just fine so I just left it alone. However, I'll keep that in mind as something to check if I do have issues again later.
I'm sorry I didn't think to recheck removing pan and tilt earlier. I'd tested removing pan and tilt from the equation before making my initial post but that was before I realised that it had to do with the starting orientation of the wheelchair. I guess I must have made a mistake there (probably didn't start the test with the same orientation as I did for my other tests by mistake).
Just so I understand what you are saying here:
I think the main issue was that the map's poses were given in localization referential instead of the map referential, with priors set in map referential. When localization referential was 180 deg off the map referential, that produced the worst case scenario where the prior set to map's poses was 180 deg off on initialization of the optimizer, and g2o could not converge correctly to move back the map's poses accordingly to their prior.
When in localisation mode, the graph given to g2o to optimize was using positions that were referencing the initial position from the start of the localisation session as their origin while their priors were referencing the initial map position? The commits you did makes sure both the positions and the priors are using the same reference point now?
Yes, and yes. g2o seems having a hard time to converge when the provided initial poses are 180 deg off their corresponding priors.
Wonderful. :)
Thank you so much for your time and dedication in tracking this down. I really appreciate it.
I'll mark this issue as closed.