
Unstable RGB-D Inertial tracking (real-time test on a car)

Open Userpc1010 opened this issue 1 year ago • 20 comments

When running the RGB-D Inertial example with an Intel D455 camera, it takes a lot of movement to initialize and it intermittently loses tracking indoors. Calibrating the camera and IMU sensors in Kalibr did not significantly improve tracking. Is this the expected behavior of ORB-SLAM3? Why does RGB-D Inertial need a lot of movement to initialize, like in the Monocular example?

https://www.youtube.com/watch?v=uDNA8HNGIoI&t=345s

My .yaml file:

%YAML:1.0

#--------------------------------------------------------------------------------------------
# Camera Parameters. Adjust them!
#--------------------------------------------------------------------------------------------
File.version: "1.0"

Camera.type: "PinHole"

# Right Camera calibration and distortion parameters (OpenCV)
Camera1.fx: 640.338
Camera1.fy: 638.673
Camera1.cx: 639.555
Camera1.cy: 362.661

Camera1.p1: 636.7834799899963
Camera1.p2: 364.1529749187936

# distortion parameters
Camera1.k1: -0.046595600657165025
Camera1.k2: 0.026160859189839854
Camera1.k3: -0.0003504572160801618
Camera1.k4: -0.0011346820837702665

# Camera resolution
Camera.width: 1280
Camera.height: 720

# Camera frames per second 
Camera.fps: 30

# Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)
Camera.RGB: 1

# Close/Far threshold. Baseline times.
Stereo.ThDepth: 60.0
Stereo.b: 0.094978

# Depth map values factor
RGBD.DepthMapFactor: 580.0  #1000 #current value tracking is more stable

# Transformation from body-frame (imu) to left camera
IMU.T_b_c1: !!opencv-matrix
   rows: 4
   cols: 4
   dt: f
   data: [0.9999410690311243, -0.0010983044171098205, 0.010800564443530714, -0.03416629640347083,
          0.0010513719550017482, 0.9999899856191709, 0.004350089468284401, -0.007858402423165139,
          -0.01080523400504305, -0.004338477702743224, 0.9999322099669149, -0.029943907325837224,
          0.0,                    0.0,                    0.0,                 1.0                 ]
          

# Do not insert KFs when recently lost
IMU.InsertKFsWhenLost: 1 #test for configure imu

# IMU noise (Use those from VINS-mono)
IMU.NoiseGyro: 0.015545769517023996 # rad/s^0.5 #test for configure imu
IMU.NoiseAcc: 0.15368276926571492  # m/s^1.5 #test for configure imu
IMU.GyroWalk: 2.37929770927606e-06 # rad/s^1.5
IMU.AccWalk:  4.116687773725591e-05 # m/s^2.5
IMU.Frequency: 400.0

#--------------------------------------------------------------------------------------------
# ORB Parameters
#--------------------------------------------------------------------------------------------
# ORB Extractor: Number of features per image
ORBextractor.nFeatures: 1250

# ORB Extractor: Scale factor between levels in the scale pyramid 	
ORBextractor.scaleFactor: 1.5

# ORB Extractor: Number of levels in the scale pyramid	
ORBextractor.nLevels: 8

# ORB Extractor: Fast threshold
# Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.
# Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST
# You can lower these values if your images have low contrast			
ORBextractor.iniThFAST: 20
ORBextractor.minThFAST: 7

#--------------------------------------------------------------------------------------------
# Viewer Parameters
#--------------------------------------------------------------------------------------------
Viewer.KeyFrameSize: 0.05
Viewer.KeyFrameLineWidth: 1.0
Viewer.GraphLineWidth: 0.9
Viewer.PointSize: 2.0
Viewer.CameraSize: 0.08
Viewer.CameraLineWidth: 3.0
Viewer.ViewpointX: 0.0
Viewer.ViewpointY: -0.7
Viewer.ViewpointZ: -3.5
Viewer.ViewpointF: 500.0

RGB Camera calib:

cam0:
  T_cam_imu:
  - [0.9999410690311243, -0.0010983044171098205, 0.010800564443530714, -0.03416629640347083]
  - [0.0010513719550017482, 0.9999899856191709, 0.004350089468284401, -0.007858402423165139]
  - [-0.01080523400504305, -0.004338477702743224, 0.9999322099669149, -0.029943907325837224]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps: []
  camera_model: pinhole
  distortion_coeffs: [-0.046595600657165025, 0.026160859189839854, -0.0003504572160801618, -0.0011346820837702665]
  distortion_model: radtan
  intrinsics: [619.8078677901826, 619.6476531771294, 636.7834799899963, 364.1529749187936]
  resolution: [1280, 720]
  rostopic: /camera/color/image_raw
  timeshift_cam_imu: -0.00811566158846955

IMU Calib:

imu0:
  T_i_b:
  - [1.0, 0.0, 0.0, 0.0]
  - [0.0, 1.0, 0.0, 0.0]
  - [0.0, 0.0, 1.0, 0.0]
  - [0.0, 0.0, 0.0, 1.0]
  accelerometer_noise_density: 0.0015368276926571492
  accelerometer_random_walk: 4.116687773725591e-05
  gyroscope_noise_density: 0.00015545769517023996
  gyroscope_random_walk: 2.037929770927606e-06
  model: calibrated
  rostopic: /camera/imu
  time_offset: 0.0
  update_rate: 400.0
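
A note on mapping the Kalibr camchain above into the ORB-SLAM3 yaml: my reading (based on the EuRoC example, and worth double-checking rather than taking as fact) is that Kalibr's T_cam_imu takes points from the IMU frame into the camera frame, while IMU.T_b_c1 is expected to take points from the camera frame into the IMU/body frame, so the Kalibr matrix would need to be inverted before being pasted into the config. A rough sketch:

```cpp
#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // Kalibr camchain above: T_cam_imu (my reading: points in the IMU frame -> camera frame).
    cv::Mat T_cam_imu = (cv::Mat_<double>(4, 4) <<
         0.9999410690311243,   -0.0010983044171098205, 0.010800564443530714, -0.03416629640347083,
         0.0010513719550017482, 0.9999899856191709,    0.004350089468284401, -0.007858402423165139,
        -0.01080523400504305,  -0.004338477702743224,  0.9999322099669149,   -0.029943907325837224,
         0.0,                   0.0,                   0.0,                   1.0);

    // If IMU.T_b_c1 is the camera-to-body (IMU) transform, it is the inverse of the above.
    // For a rigid transform the inverse equals [R^T, -R^T t; 0 0 0 1].
    cv::Mat T_b_c1 = T_cam_imu.inv();

    std::cout << "IMU.T_b_c1 candidate:\n" << T_b_c1 << std::endl;
    return 0;
}
```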

Userpc1010 avatar Mar 09 '23 17:03 Userpc1010

Hi there, I'm going to be getting OS3 working with my Kinect v2 soon. I have OS3 working through ROS2 in monocular (no inertial mode), so once I write the code to publish the Kinect's data over ROS I will be testing the RGB-D performance. I do have an IMU I am considering using as well, as I know the performance increase in monocular mode is said to be large. I don't see why RGB-D would perform worse with integrated IMU data, or why it would provide less improvement than stereo does. I'll update you on my findings, though I'm not sure if/when I'll get the IMU included.

Have you found the performance of RGB-D alone worse than with IMU data? I've seen your video on YouTube of RGB-D alone, but that is outdoors, which is inherently more difficult for some RGB-D cameras and for ORB-SLAM.

leahcornelius avatar Mar 10 '23 08:03 leahcornelius

I tested the RGB-D example without the IMU and the result was more stable: the camera position did not leave the room, and after tracking was lost it quickly found map merges or detected loops. I suppose the drift comes from the camera position being predicted by the inertial sensors at the moments when tracking is lost (between visual tracking measurements). Increasing the sensor noise values in the configuration file did not give significant improvements; it slowed down the camera drift when tracking was lost, but the number of tracking losses and localization failures did not change. Increasing the noise values even further is not read correctly: after parsing the .yaml file the noise values came out several hundred times smaller, leading to more drift. I assume the sensor may report acceleration in units different from the ones ORB_SLAM3 uses, but I have not yet figured out what units the IMU data should be in.

Also, initialization did not happen fast enough until I changed RGBD.DepthMapFactor to 580. I assume this scale, which converts the depth measurements into the point cloud (the distances to the surrounding objects), also has to match the distance units obtained from IMU integration so that the predicted camera position can be fused correctly with the camera data. In my opinion, the inertial sensors in the D455 are really not very good in terms of noise. I can't imagine how long they can integrate the camera position from the accelerometer, maybe they really do drift too fast, but with the gyroscope it is possible to get a stable camera rotation, so I would not want to abandon the IMU entirely.
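
As far as I can tell from the RGB-D tracking code (a rough sketch of my reading, not verified in every version), RGBD.DepthMapFactor is only the divisor that turns the raw 16-bit depth image into meters before tracking uses it; with the D455 streaming Z16 at its default 1 mm depth unit the factor would normally be 1000, so a factor of 580 stretches every depth by roughly 1000/580 ≈ 1.7, which also changes the metric scale the IMU has to agree with:

```cpp
#include <opencv2/core.hpp>
#include <cmath>

// Rough sketch of the depth scaling: RGBD.DepthMapFactor is the divisor that
// turns the raw 16-bit depth image into meters before tracking uses it.
// Assumes imD is the CV_16U frame from the camera and that the D455 reports
// depth in millimeters (1 unit = 1 mm), so a factor of 1000 gives meters.
cv::Mat toMetricDepth(const cv::Mat& imD, float depthMapFactor)
{
    cv::Mat imDepth = imD;
    const float scale = 1.0f / depthMapFactor;          // 0.001 for mm -> m
    if (std::fabs(depthMapFactor - 1.0f) > 1e-5f || imDepth.type() != CV_32F)
        imDepth.convertTo(imDepth, CV_32F, scale);      // depth_m = raw * scale
    return imDepth;                                     // same units as Stereo.b
}
```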

Userpc1010 avatar Mar 10 '23 13:03 Userpc1010

@Userpc1010 did you ever come across any fixes for the camera drift in lost-tracking cases?

I too have experienced this issue and have attempted (1) increasing noise values, (2) trying different units for IMU output, and (3) adjusting ORB parameters. None of these adjustments seem to address the camera drift, which I suspect is somehow related to IMU integration.

eparham-88 avatar May 22 '23 12:05 eparham-88

I made some corrections to the method of resetting the camera initialization (it seems I increased the waiting time for the next frame), and I also fixed an error in the yaml configuration by specifying the baseline (Stereo.b) not in centimeters (as in the example config pdf) but in millimeters. I set RGBD.DepthMapFactor back to the default, and tracking became more stable in the room. I'm currently waiting for a mini PC (powered by a power bank) to test the changes outdoors.
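
For anyone tuning the same fields, my rough understanding (not authoritative) is that Stereo.b is expected in meters (if I remember right, the stock EuRoC config uses about 0.11 for its roughly 11 cm baseline) and that the close/far point threshold works out to roughly Stereo.b times Stereo.ThDepth meters, so the combination is easy to sanity-check before a run:

```cpp
#include <iostream>

// Purely illustrative: my reading of how the close/far point threshold is derived.
// The tracking code appears to compute bf = Stereo.b * fx and then the threshold
// as ThDepth * bf / fx, which collapses to b * ThDepth (in meters).
int main()
{
    const double fx      = 640.338;    // Camera1.fx from the yaml above
    const double b       = 0.094978;   // Stereo.b (meters, if I read the examples right)
    const double thDepth = 60.0;       // Stereo.ThDepth

    const double bf = b * fx;                             // "baseline times fx"
    const double closeFarThreshold = thDepth * bf / fx;   // = b * thDepth

    std::cout << "Points closer than ~" << closeFarThreshold
              << " m are treated as reliable 'close' points\n";
    return 0;
}
```

With b = 0.094978 and ThDepth = 60 that is about 5.7 m of "close" points; with b = 0.0094978 and ThDepth = 40 it would only be about 0.38 m, which is worth keeping in mind when comparing runs.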

Userpc1010 avatar May 22 '23 12:05 Userpc1010

Interesting, I will try those fixes and see if the drift is mitigated. @Userpc1010 can you point me to which files and lines to adjust in the source code for increasing waiting time for the next frame? Thank you!

eparham-88 avatar May 22 '23 13:05 eparham-88

@userpc1010 I have the same question as @eparham-88

OPyshkin avatar May 29 '23 17:05 OPyshkin

I've done the outdoor tests; they did not give unambiguous results, so it is unclear whether the change gave a positive result or whether it comes down to the initialization/calibration at startup. Outdoors, the track was lost only once and quickly relocalized.

Userpc1010 avatar May 29 '23 17:05 Userpc1010

This is definitely a much better result than indoors (to bump this topic). In the room, as before, track loss occurs periodically, apparently at moments of fast camera rotation with a small radius from the center of rotation. Here are some of the changes I made:

%YAML:1.0

#--------------------------------------------------------------------------------------------
# Camera Parameters. Adjust them!
#--------------------------------------------------------------------------------------------
File.version: "1.0"

Camera.type: "PinHole"

# Right Camera calibration and distortion parameters (OpenCV)
Camera1.fx: 640.338
Camera1.fy: 638.673
Camera1.cx: 639.555
Camera1.cy: 362.661

Camera1.p1: -0.000210095
Camera1.p2: 0.000218042

# distortion parameters
Camera1.k1: -0.0560642
Camera1.k2: 0.0710607
Camera1.k3: -0.0235321
#Camera1.k4: -0.0011346820837702665

# Camera resolution
Camera.width: 1280
Camera.height: 720

# Camera frames per second 
Camera.fps: 30

# Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)
Camera.RGB: 1

# Close/Far threshold. Baseline times.
Stereo.ThDepth: 40.0
Stereo.b: 0.0094978

# Depth map values factor
RGBD.DepthMapFactor: 1000.0

# Transformation from body-frame (imu) to left camera
IMU.T_b_c1: !!opencv-matrix
   rows: 4
   cols: 4
   dt: f
   data: [0.999993, -0.00324738, 0.00169563, 0.0289787,
          0.00324059, 0.999987, 0.0039931, 0.00737133,
          -0.00170858, -0.00398758, 0.999991, 0.0157927,
           0.0,         0.0,        0.0,      1.0       ]
          

# Do not insert KFs when recently lost
IMU.InsertKFsWhenLost: 0

# IMU noise (Use those from VINS-mono)
IMU.NoiseGyro: 0.015545769517023996 # rad/s^0.5
IMU.NoiseAcc: 0.15368276926571492  # m/s^1.5
IMU.GyroWalk: 2.237929770927606e-03 # rad/s^1.5
IMU.AccWalk:  4.116687773725591e-02 # m/s^2.5
IMU.Frequency: 200.0

#--------------------------------------------------------------------------------------------
# ORB Parameters
#--------------------------------------------------------------------------------------------
# ORB Extractor: Number of features per image
ORBextractor.nFeatures: 1500

# ORB Extractor: Scale factor between levels in the scale pyramid 	
ORBextractor.scaleFactor: 1.2

# ORB Extractor: Number of levels in the scale pyramid	
ORBextractor.nLevels: 8

# ORB Extractor: Fast threshold
# Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.
# Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST
# You can lower these values if your images have low contrast			
ORBextractor.iniThFAST: 20
ORBextractor.minThFAST: 7

#--------------------------------------------------------------------------------------------
# Viewer Parameters
#--------------------------------------------------------------------------------------------
Viewer.KeyFrameSize: 0.05
Viewer.KeyFrameLineWidth: 1.0
Viewer.GraphLineWidth: 0.9
Viewer.PointSize: 2.0
Viewer.CameraSize: 0.08
Viewer.CameraLineWidth: 3.0
Viewer.ViewpointX: 0.0
Viewer.ViewpointY: -0.7
Viewer.ViewpointZ: -3.5
Viewer.ViewpointF: 500.0

Screenshot

Userpc1010 avatar May 29 '23 17:05 Userpc1010

@Userpc1010 thanks for a quick response! And what are your current hardware specs?

OPyshkin avatar May 29 '23 17:05 OPyshkin

> @Userpc1010 thanks for a quick response! And what are your current hardware specs?

For the outdoor test I use a mini PC with a 12th-gen mobile Intel i7, 16 GB of 3200 MHz dual-channel RAM and a mobile 3050 Ti 4 GB. Indoors: an FX 8320, 8 GB of 2666 MHz dual-channel RAM and an NVIDIA 750 Ti.

Userpc1010 avatar May 29 '23 17:05 Userpc1010

@Userpc1010 that's a nice setup. I'm currently trying to run orbslam on a mini pc with similar specs to jetson nano, and the initialization is failing on it, even though the same code works quite well on a normal computer. This is not related to calibration/extrinsics, because it proceeds to fail even on euroc datasets

OPyshkin avatar May 29 '23 17:05 OPyshkin

> @Userpc1010 that's a nice setup. I'm currently trying to run orbslam on a mini pc with similar specs to jetson nano, and the initialization is failing on it, even though the same code works quite well on a normal computer. This is not related to calibration/extrinsics, because it proceeds to fail even on euroc datasets

Perhaps it is worth reducing the number of features? Also, to reduce the load, it is better to use the RGB-D mode: the depth map and distortion correction are computed on the Intel camera's ASIC, which reduces the load on your computer.
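
For reference, grabbing the hardware-computed depth already aligned to the color stream with librealsense looks roughly like this (the stream profiles, resolutions and the hand-off to the SLAM object are placeholders, not a tested configuration):

```cpp
#include <librealsense2/rs.hpp>
#include <opencv2/core.hpp>

// Rough sketch: let the camera's ASIC produce depth, align it to the color
// stream on the host, and hand both images to the RGB-D front end.
int main()
{
    rs2::pipeline pipe;
    rs2::config cfg;
    cfg.enable_stream(RS2_STREAM_COLOR, 1280, 720, RS2_FORMAT_BGR8, 30);
    cfg.enable_stream(RS2_STREAM_DEPTH, 848, 480, RS2_FORMAT_Z16, 30);
    pipe.start(cfg);

    rs2::align align_to_color(RS2_STREAM_COLOR);  // reproject depth into the color frame

    while (true)
    {
        rs2::frameset frames = pipe.wait_for_frames();
        frames = align_to_color.process(frames);

        rs2::video_frame color = frames.get_color_frame();
        rs2::depth_frame depth = frames.get_depth_frame();  // now at color resolution

        cv::Mat imRGB(cv::Size(1280, 720), CV_8UC3,
                      const_cast<void*>(color.get_data()), cv::Mat::AUTO_STEP);
        cv::Mat imD(cv::Size(1280, 720), CV_16UC1,
                    const_cast<void*>(depth.get_data()), cv::Mat::AUTO_STEP);

        // imRGB / imD would be passed to the SLAM system here,
        // e.g. SLAM.TrackRGBD(imRGB, imD, timestamp);
        (void)imRGB; (void)imD;
    }
    return 0;
}
```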

Userpc1010 avatar May 29 '23 18:05 Userpc1010

> @Userpc1010 that's a nice setup. I'm currently trying to run orbslam on a mini pc with similar specs to jetson nano, and the initialization is failing on it, even though the same code works quite well on a normal computer. This is not related to calibration/extrinsics, because it proceeds to fail even on euroc datasets
>
> Perhaps it is worth reducing the number of features? Also, to reduce the load, it is better to use the RGB-D mode: the depth map and distortion correction are computed on the Intel camera's ASIC, which reduces the load on your computer.

Yes, that helps, I've already tried it; however, it looks like the waiting time for the next keyframe is hardcoded somewhere.

OPyshkin avatar May 29 '23 18:05 OPyshkin

The test on the car was not very successful; next time I will try to make more loops to see the drift, and also check the RGB-D mode: https://m.youtube.com/watch?v=yf8Eucorn90&feature=youtu.be

RGB-D Inertial mode seems to save more computing resources: relying on the IMU, it takes fewer pictures from the camera.

Userpc1010 avatar Jun 01 '23 06:06 Userpc1010

> @Userpc1010 that's a nice setup. I'm currently trying to run orbslam on a mini pc with similar specs to jetson nano, and the initialization is failing on it, even though the same code works quite well on a normal computer. This is not related to calibration/extrinsics, because it proceeds to fail even on euroc datasets
>
> Perhaps it is worth reducing the number of features? Also, to reduce the load, it is better to use the RGB-D mode: the depth map and distortion correction are computed on the Intel camera's ASIC, which reduces the load on your computer.
>
> Yes, that helps, I've already tried it; however, it looks like the waiting time for the next keyframe is hardcoded somewhere.

Finally, I managed to figure out that the issue wasn't in ORB-SLAM3's code, but in my wrapper. However, a guide/documentation on how to configure certain parameters like init time, solver iterations and ORB params would be great.

OPyshkin avatar Jun 01 '23 16:06 OPyshkin

RGB-D (without imu) test, sunny day:

https://youtu.be/V2KbNDfBeEU

Final result: it failed, it did not work on any attempt :(

RGB-D Inertial test again, same day and no change:

https://youtu.be/wY5aPj4W5sQ

As a result, RGB-D Inertial shows some signs of working, but in its current form it is not really usable. At a minimum it needs to save the IMU initialization to disk, to avoid the annoying movements required to initialize the IMU every time the program starts. It also needs to stop the camera position from flying away when the track is lost: the camera should stay in place rather than keep moving in a random direction. It would also be good, when tracking is lost, not to start a new track from the beginning but to continue the old one (or properly end the old track); the jump back to the beginning of the path at the place where the old track intersects the new one looks strange and illogical. Ideally, relocalization should also be added to the inertial modes. That is the necessary minimum. In that sense it is doubly sad that the RGB-D tests without the IMU failed, since that mode would not have needed such modifications. The ORB feature detector does not seem very reliable :(. I added my guesses about this to the description of the last test.

Userpc1010 avatar Jun 02 '23 15:06 Userpc1010


I did some experiments on losing track: at one point in the EuRoC dataset I made a bunch of frames completely black, so the features get lost and the track continues relying only on IMU measurements. The IMU-only trajectory appeared to be quite noisy; however, if the feature track was lost for less than about 3 seconds it managed to recover more or less at the point where IMU + visual SLAM should have been.
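
In case anyone wants to reproduce that kind of dropout test, blacking out a window of frames before they reach the tracker is enough; the frame indices and the SLAM call below are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <string>
#include <vector>

// Sketch: simulate a short loss of visual tracking by zeroing a window of
// frames before they reach the SLAM system, leaving the IMU stream untouched.
void feedSequence(const std::vector<std::string>& imagePaths,
                  const std::vector<double>& timestamps,
                  size_t blackoutBegin, size_t blackoutEnd)   // hypothetical window
{
    for (size_t i = 0; i < imagePaths.size(); ++i)
    {
        cv::Mat im = cv::imread(imagePaths[i], cv::IMREAD_GRAYSCALE);
        if (i >= blackoutBegin && i < blackoutEnd)
            im.setTo(0);   // no features can be extracted from these frames

        // Pass im, timestamps[i] and the accumulated IMU samples to the SLAM
        // system as usual, e.g. SLAM.TrackMonocular(im, timestamps[i], vImuMeas);
        (void)im; (void)timestamps;
    }
}
```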

OPyshkin avatar Jun 03 '23 19:06 OPyshkin

I compared the output of the ORB detector and a contour detector, and I think contours could serve as a good replacement for the ORB detector if the contours are treated as lines and compared. What is more curious, the ORB detector in OpenCV seems to be much more resistant to motion blur than the output of ORB-SLAM3's detector, even though ORB-SLAM3 uses OpenCV too; maybe ORB-SLAM3 uses legacy detector code? Or it may be a matter of depth from the camera: if a feature has no matching depth, it is not displayed in the detector window... (a minimal version of the OpenCV test is sketched below the links)

https://youtu.be/dnsttKShs54 OpenCV ORB detector

https://youtu.be/qVm4uxygjmA ORB-SLAM3 ORB detector

https://github.com/jiexiong2016/GCNv2_SLAM running performance with ORB and GCNv2 features

https://youtu.be/RjaB3O6st2E Contour detector
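
For reference, the OpenCV side of this comparison only needs a few lines; this is a minimal sketch, and the camera index and feature budget are just examples rather than the exact settings from the videos:

```cpp
#include <opencv2/features2d.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <vector>

// Minimal cv::ORB viewer used for this kind of side-by-side comparison.
int main()
{
    cv::VideoCapture cap(0);                      // camera index is an assumption
    cv::Ptr<cv::ORB> orb = cv::ORB::create(1250); // same feature budget as the yaml

    cv::Mat frame, gray, vis;
    std::vector<cv::KeyPoint> keypoints;
    while (cap.read(frame))
    {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        orb->detect(gray, keypoints);             // detect only, no descriptors needed
        cv::drawKeypoints(frame, keypoints, vis);
        cv::imshow("OpenCV ORB", vis);
        if (cv::waitKey(1) == 27) break;          // Esc to quit
    }
    return 0;
}
```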

Userpc1010 avatar Jun 03 '23 19:06 Userpc1010

Does anyone have any idea why there is such a difference in resistance to motion blur between the ORB detectors? If the ORB-SLAM3 detector worked at least as well as shown in the OpenCV ORB detector video, there would certainly not be such problems with track loss. The depth seems to be fine; in any case, for me it does not fall apart during sudden movements, which is not surprising with global-shutter, synchronized cameras.
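
One possible source of the difference, going only by the comments in the yaml above: ORB-SLAM3's extractor splits the image into a grid and runs FAST per cell with iniThFAST, falling back to minThFAST when a cell finds nothing, instead of keeping only the globally strongest responses the way cv::ORB does, so weak (and possibly blurrier) cells still contribute corners. A toy sketch of that per-cell scheme, with the octree distribution and scale pyramid omitted and an arbitrary cell size:

```cpp
#include <opencv2/features2d.hpp>
#include <algorithm>
#include <vector>

// Toy version of the per-cell FAST scheme described in the ORB parameter
// comments: try iniThFAST in each cell, fall back to minThFAST if the cell
// is empty. (The real extractor also distributes keypoints with an octree
// and builds a scale pyramid; that part is omitted here.)
std::vector<cv::KeyPoint> gridFast(const cv::Mat& gray,
                                   int iniThFAST = 20, int minThFAST = 7,
                                   int cell = 64)   // cell size is an assumption
{
    std::vector<cv::KeyPoint> all;
    for (int y = 0; y < gray.rows; y += cell)
        for (int x = 0; x < gray.cols; x += cell)
        {
            cv::Rect roi(x, y, std::min(cell, gray.cols - x), std::min(cell, gray.rows - y));
            std::vector<cv::KeyPoint> kps;
            cv::FAST(gray(roi), kps, iniThFAST, true);
            if (kps.empty())
                cv::FAST(gray(roi), kps, minThFAST, true);   // low-contrast fallback
            for (auto& kp : kps)
            {
                kp.pt.x += x;   // shift back into full-image coordinates
                kp.pt.y += y;
                all.push_back(kp);
            }
        }
    return all;
}
```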

Userpc1010 avatar Jun 04 '23 10:06 Userpc1010

I'm coming to the idea that ORB-SLAM3's shortcomings can be lived with. The algorithm has shown itself quite well in terms of inertial navigation on rather noisy sensors. If you use a more stable IMU, for example an ADIS16470, and a stereo camera with a larger baseline and longer focal length lenses (to keep more features at long range during linear movement), you can put up with the instability during rotations and rely entirely on the IMU at those moments; there are sensors capable of integrating the position for up to 90 seconds, and during that time the rotation will end and SLAM will reset the error from the cameras. Stereo cameras also seem more stable and, to the eye, extract more ORB points. This way you can focus on getting good visual odometry during linear movement, without paying attention to the instability of the ORB detector. When building such hardware, attention should be paid from the start to synchronizing the camera and the IMU (running the sensors on interrupts) and to firing the camera shutters simultaneously.

Userpc1010 avatar Jun 05 '23 20:06 Userpc1010