
Request: Source code for MIT Stata Center dataset preprocessing (timestamps, 15 Hz, GT alignment)

Open lin-chain311 opened this issue 4 months ago • 5 comments

Hello,

I’m studying topics related to RTAB-Map and trying to reproduce the results from Section 4.4 (MIT Stata Center) of the paper “RTAB-Map as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation” (Labbé, Journal of Field Robotics, 2018).

The paper notes that the MIT Stata Center dataset requires several preprocessing steps. After downloading the dataset, I encountered similar issues—especially timestamp misalignment across sensors—which causes unstable or inaccurate visual-odometry estimates and, at times, runtime errors.

Could you please advise whether there are example resources for this dataset, such as:

  • Sample preprocessing scripts or a small pipeline tailored to the MIT Stata Center data.
  • Example ROS launch files showing recommended topics, frame names, and synchronization settings.
  • Code or guidance to republish the image stream at 15 Hz (a rough sketch of the kind of node I have in mind is included after this list).
  • Tools, scripts, or recommendations to align ground-truth timestamps with the sensor data (e.g., offset estimation or resampling).
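To make the last two points concrete, here is roughly the kind of helper node I have in mind. It is only a minimal roscpp sketch (not taken from rtabmap_ros; the topic names, the 15 Hz target, and the constant stamp offset are placeholder assumptions on my part) that drops frames above a target rate and shifts the image header stamp by a fixed offset:

#include <ros/ros.h>
#include <sensor_msgs/Image.h>

ros::Publisher pub;
ros::Time last_stamp;
double min_period = 1.0 / 15.0; // target ~15 Hz
double stamp_offset = 0.0;      // constant offset (s) to line up with the other sensors

void imageCallback(const sensor_msgs::ImageConstPtr & msg)
{
	// Drop frames that arrive faster than the target rate
	if (!last_stamp.isZero() && (msg->header.stamp - last_stamp).toSec() < min_period)
	{
		return;
	}
	last_stamp = msg->header.stamp;

	sensor_msgs::Image out = *msg;
	out.header.stamp += ros::Duration(stamp_offset); // apply the assumed time offset
	pub.publish(out);
}

int main(int argc, char ** argv)
{
	ros::init(argc, argv, "image_throttle_offset");
	ros::NodeHandle nh;
	ros::NodeHandle pnh("~");
	double rate = 15.0;
	pnh.param("rate", rate, 15.0);
	pnh.param("stamp_offset", stamp_offset, 0.0);
	min_period = 1.0 / rate;
	pub = nh.advertise<sensor_msgs::Image>("image_throttled", 10);
	ros::Subscriber sub = nh.subscribe("image_raw", 10, imageCallback);
	ros::spin();
	return 0;
}

If there is already a recommended way to do this (e.g., topic_tools throttle plus handling the offset elsewhere in the pipeline), I am happy to use that instead.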

My setup:

  • Visual odometry: RGB-D camera
  • LiDAR
  • IMU

Thank you in advance for any pointers or example code that could help reproduce the MIT Stata Center experiment more faithfully.

lin-chain311 avatar Aug 27 '25 07:08 lin-chain311

Unfortunately, it looks like I didn't take the time to write exhaustive instructions on how to pre-process the bags and to reproduce the results of the paper. I think everything we need is there though: https://github.com/introlab/rtabmap_ros/tree/master/rtabmap_legacy/launch/jfr2018#readme.

I think the IMU was not used in the results.

matlabbe avatar Sep 03 '25 00:09 matlabbe

Hi and thanks a lot for your great work — it’s been extremely helpful to me.

I’m running visual odometry in F2M mode on the MIT dataset. On sequences with in-place rotations or near-zero translation, I observe that the pose estimate becomes unstable and sometimes drifts severely enough to cause VO failure. To mitigate this, I fed IMU measurements into rgbd_odometry, hoping to reduce the well-known rotation degeneracy. However, the improvement was modest.

From reading the source code, it looks like the IMU is mainly used in two places: it provides an initial rotation guess for each frame (while the final pose is still determined visually), and it adds a gravity constraint (“gravity edge”) during local bundle adjustment, which I have enabled. However, my estimator is currently constrained to 3DoF (SE(2): x, y, yaw). Because roll and pitch are fixed in 3DoF (see the to3DoF() sketch after the excerpt below), the gravity edge seems to have little to no effect in practice, which might explain why adding the IMU doesn’t help much in my case. If that understanding is wrong, please correct me.

Transform imuCurrentTransform;
if (!guessIn.isNull()) // default guessIn is null
{
	guess = guessIn;
}
else if (!imus_.empty())
{
	UWARN("replace orientation guess with IMU");
	// replace orientation guess with IMU (if available)
	imuCurrentTransform = Transform::getTransform(imus_, data.stamp());
	if (!imuCurrentTransform.isNull() && !imuLastTransform_.isNull())
	{
		Transform orientation = imuLastTransform_.inverse() * imuCurrentTransform;
		guess = Transform(
			orientation.r11(), orientation.r12(), orientation.r13(), guess.x(),
			orientation.r21(), orientation.r22(), orientation.r23(), guess.y(),
			orientation.r31(), orientation.r32(), orientation.r33(), guess.z());
		if (_force3DoF)
		{
			guess = guess.to3DoF();
		}
	}
	else if (!imuLastTransform_.isNull())
	{
		UWARN("Could not find imu transform at %f", data.stamp());
	}
}
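
For reference, this is my reading of what to3DoF() does (paraphrased from the rtabmap Transform code, so treat it as an approximation rather than an exact quote): the guess is projected back onto SE(2), dropping z, roll, and pitch, which is why the IMU orientation contribution mostly disappears in 3DoF mode.

// My paraphrase of Transform::to3DoF(): keep x, y and yaw; zero out z, roll and pitch
Transform Transform::to3DoF() const
{
	float x, y, z, roll, pitch, yaw;
	this->getTranslationAndEulerAngles(x, y, z, roll, pitch, yaw);
	return Transform(x, y, 0, 0, 0, yaw);
}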

My questions

  • Is the IMU currently used only as an initial prior for rotation, rather than being tightly fused in the estimator? If there is a tightly-coupled VIO path, how can I enable it?
  • Are there recommended parameters for rotation-dominant sequences in F2M? For example, keyframe selection thresholds, feature/inlier limits, or any IMU noise/weighting, and time-sync settings that are important for stability?
  • When you evaluated on the MIT dataset, how did you address segments with pure or near-pure rotation?

Why I expected IMU to help more

Prior VIO literature (Leutenegger et al. 2014, “Keyframe-based visual-inertial odometry using nonlinear optimization”) highlights that VO can degrade under low parallax / pure rotation, and that using an IMU typically improves observability and robustness. I’m not sure whether I misunderstood the paper or whether some aspects weren’t fully detailed.

I likely misconfigured something or misunderstood the intended IMU usage. Any guidance on proper configuration or on whether a tighter visual-inertial fusion path exists would be super helpful. Thanks again for your time and for maintaining this project!

lin-chain311 avatar Sep 09 '25 03:09 lin-chain311

F2M+IMU is loosely-coupled VIO where, as you observed, the IMU helps to track features, but ultimately the final estimation is purely visual. It cannot estimate the pose if not enough visual features can be tracked. To use tightly-coupled VIO, you would need to integrate an existing third-party approach (for example, by replacing the visual odometry node with another third-party node doing VIO).

I think the default settings should work on most of the sequences, though there are some custom parameters in the launch_rgbd and launch_stereo launch files.

Pure rotation should not be a problem (even without an IMU) if there are enough good visual features to track and the update rate is fast enough for the rotation speed. I remember one or two places where the robot entered a small room and the camera field of view contained mostly a large white wall or a very low number of good features. VO needed to not miss a frame at 15 Hz or it got lost.

Where exactly does that paper talk about low parallax / pure rotation? Monocular SLAM cannot work with pure rotation, but stereo SLAM can. Adding an IMU to a monocular SLAM system (VIO) would make it work even in pure rotations. The biggest advantages I see of including an IMU are having absolute roll and pitch estimation (aligned with gravity all the time), and that if there are not enough visual features to track, you can momentarily rely on the IMU estimation. You can try ARCore or ARKit with your phone, block the camera, and see how the tracking goes.

You may look at approaches like OpenVINS or msckf-vio to do tightly-coupled VIO.
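
The integration point is simply that rtabmap consumes nav_msgs/Odometry, so the output of such a node can be fed in place of rgbd_odometry's output. If the frame ids already match your TF tree, remapping the topic is enough; otherwise something like this minimal adapter sketch could be used (vio/odometry, odom and base_link are placeholder names here):

#include <ros/ros.h>
#include <nav_msgs/Odometry.h>

ros::Publisher pub;

void vioCallback(const nav_msgs::Odometry::ConstPtr & msg)
{
	// Republish the external VIO odometry with frame ids matching the robot's TF tree
	nav_msgs::Odometry out = *msg;
	out.header.frame_id = "odom";     // placeholder fixed odometry frame
	out.child_frame_id = "base_link"; // placeholder robot base frame
	pub.publish(out);
}

int main(int argc, char ** argv)
{
	ros::init(argc, argv, "vio_to_rtabmap_adapter");
	ros::NodeHandle nh;
	pub = nh.advertise<nav_msgs::Odometry>("odom", 10);                  // odometry topic rtabmap subscribes to
	ros::Subscriber sub = nh.subscribe("vio/odometry", 10, vioCallback); // external VIO output (placeholder)
	ros::spin();
	return 0;
}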

matlabbe avatar Sep 10 '25 02:09 matlabbe

In the commit above, I updated the README with some examples: https://github.com/introlab/rtabmap_ros/tree/master/rtabmap_legacy/launch/jfr2018#readme

matlabbe avatar Sep 18 '25 04:09 matlabbe

Thank you, this helps me a lot

lin-chain311 avatar Sep 30 '25 08:09 lin-chain311