ORB_SLAM2-documented

Feature detection's effect on initialization

Open KarimHabbab92 opened this issue 5 years ago • 6 comments

Hello @AlejandroSilvestri, thank you so much for your documented version of ORB-SLAM; it has been very helpful to me. However, I have some questions about the effect of feature detection on initialization time. I am still new to computer vision, and I do not understand why FAST detects some features in some frames but fails to detect them when I re-run the same video. What effect does this have on the algorithm's initialization time? Could you please suggest how to improve the initialization time? Thanks in advance.

KarimHabbab92 • Aug 22 '19 07:08

@KarimHabbab92,

FAST is deterministic: with the same parameters on the same image, the resulting keypoints will be the same.
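
A minimal sketch of the segment test behind FAST illustrates why it is deterministic: the decision for each pixel depends only on that pixel, its 16-pixel ring, and the threshold. This is a simplified pure-Python version (no non-maximum suppression, no learned decision tree), not the OpenCV or ORB-SLAM2 implementation; the function names, the threshold of 20, and the arc length of 9 are assumptions chosen for illustration.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def has_arc(flags, n):
    """Return True if the circular list of booleans has n contiguous True values."""
    run = 0
    for flag in flags + flags[:n - 1]:  # append the start to handle wrap-around
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def fast_corners(img, threshold=20, arc=9):
    """Simplified FAST: img is a list of rows of grey values (illustrative only)."""
    height, width = len(img), len(img[0])
    corners = []
    for y in range(3, height - 3):
        for x in range(3, width - 3):
            p = img[y][x]
            ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
            brighter = [v > p + threshold for v in ring]
            darker = [v < p - threshold for v in ring]
            if has_arc(brighter, arc) or has_arc(darker, arc):
                corners.append((x, y))
    return corners


def make_square_image(size=16, start=6):
    """Synthetic test image: a bright square on a dark background."""
    return [[200 if x >= start and y >= start else 0 for x in range(size)]
            for y in range(size)]
```

Calling `fast_corners` twice on the same image with the same parameters returns identical lists, so any run-to-run variation has to come from a later stage.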

ORB-SLAM2 uses a quadtree (in a function with the odd name ComputeKeyPointsOctTree) to limit the number of keypoints crowded into a small area. Because this quadtree function doesn't clean its variables each time it starts, the result varies a little. So the quadtree discards different keypoints each time, but the variation can't be dramatic. If you pause a video, only a few keypoints blink (because sometimes they are discarded, sometimes not).
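
The idea of thinning crowded keypoints with a quadtree can be sketched as below. This is a simplified illustration, not the actual `ComputeKeyPointsOctTree` code: the function `distribute_quadtree`, the `(x, y, response)` tuple layout, and the `min_size` guard are all assumptions. The most crowded cell is split into four until there are about `max_keep` cells, and only the strongest keypoint in each cell is kept.

```python
def distribute_quadtree(keypoints, bounds, max_keep, min_size=1.0):
    """Thin out crowded keypoints (illustrative sketch).

    keypoints: list of (x, y, response) tuples.
    bounds:    (x0, y0, x1, y1) rectangle containing all keypoints.
    Returns at most max_keep + 2 keypoints, the strongest one per cell.
    """
    nodes = [(bounds, list(keypoints))] if keypoints else []
    while len(nodes) < max_keep:
        # Only cells with several keypoints that are still large enough can split.
        candidates = [
            i for i, (b, pts) in enumerate(nodes)
            if len(pts) > 1 and max(b[2] - b[0], b[3] - b[1]) > min_size
        ]
        if not candidates:
            break
        # Split the most crowded cell into four quadrants.
        i = max(candidates, key=lambda k: len(nodes[k][1]))
        (x0, y0, x1, y1), pts = nodes.pop(i)
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        children = [
            ((x0, y0, mx, my), []), ((mx, y0, x1, my), []),
            ((x0, my, mx, y1), []), ((mx, my, x1, y1), []),
        ]
        for p in pts:
            idx = (1 if p[0] >= mx else 0) + (2 if p[1] >= my else 0)
            children[idx][1].append(p)
        # Empty quadrants are dropped.
        nodes.extend(c for c in children if c[1])
    # Keep the keypoint with the highest response in each remaining cell.
    return [max(pts, key=lambda p: p[2]) for _, pts in nodes]
```

With a dense cluster in one corner and a few isolated keypoints elsewhere, the isolated ones survive while most of the cluster is discarded.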

I don't know what you are seeing that makes you say FAST sometimes detects features and sometimes doesn't. Initialization is tricky. It takes a frame as a reference and keeps matching every new frame against it until those matches lead to a good initialization. With every new frame some matches are lost, so there are fewer and fewer matches until the count falls below 100. When this happens, the current frame becomes the new reference and the process starts over, again and again until a good initialization.
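
The reference-frame loop described above can be sketched as follows. This is a simplified illustration, not the code in ORB-SLAM2's tracking module: `count_matches` and `try_initialize` are hypothetical callbacks standing in for descriptor matching and the geometric initialization, and the threshold of 100 is the one mentioned above.

```python
def run_initialization(frames, count_matches, try_initialize, min_matches=100):
    """Return (index of the frame where initialization succeeded, number of resets).

    The index is None if no pair of frames produced a good initialization.
    """
    reference = None
    resets = 0
    for index, frame in enumerate(frames):
        if reference is None:
            reference = frame  # the first frame becomes the reference
            continue
        matches = count_matches(reference, frame)
        if matches < min_matches:
            # Too few matches survive: this frame becomes the new reference.
            reference = frame
            resets += 1
            continue
        if try_initialize(reference, frame, matches):
            return index, resets
    return None, resets
```

Every reset throws away the parallax accumulated so far, which is one way small differences in the kept keypoints can turn into large differences in initialization time.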

Because of the above-mentioned variations, this process is not deterministic, and each run leads to a different initialization.

AlejandroSilvestri • Aug 22 '19 12:08

Thank you very much for the response, @AlejandroSilvestri! Actually, I assumed that it does not catch the same features because the initialization time differs between runs on the same video, and even between videos of the same nature (a residential area, for example). Sometimes it took up to 46 seconds to initialize, even though I am using the binary vocabulary file suggested by a developer on GitHub. That's why I started to think about extracting some frames after the feature extraction stage, so I might find something interesting that leads me to a deeper understanding of what's going on. PS: I am doing this development for my master's thesis. Thanks again for the response; your time is much appreciated.

KarimHabbab92 • Aug 22 '19 13:08

@KarimHabbab92 ,

I believe the main problem is in descriptor matching. Descriptors are meant to uniquely identify a keypoint by its surrounding visual appearance. ORB-SLAM2 uses BRIEF descriptors (because ORB uses them), which are fast and compact but not as reliable as others, such as SURF descriptors.

So, as the camera moves, the image gets a different perspective of the scene and descriptors begin to fail to match. Initialization relies entirely on a high number of matches during a change of perspective; that's why ORB-SLAM2 is so hard to initialize. There is plenty of room for improvement in this area. Because initialization is not at the core of visual SLAM, I believe the authors didn't focus on it.
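
A minimal sketch of binary-descriptor matching shows why a change of perspective costs matches. It treats each 256-bit descriptor as a Python integer, compares descriptors by Hamming distance, and accepts a match only if the best distance is small and clearly better than the second best. The function names, the distance limit of 50, and the ratio of 0.9 are assumptions for illustration, not values taken from ORB-SLAM2; `flip_bits` is a hypothetical stand-in for the appearance change caused by camera motion.

```python
import random


def hamming(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def flip_bits(descriptor, count, rng, bits=256):
    """Simulate appearance change by flipping `count` distinct random bits."""
    for pos in rng.sample(range(bits), count):
        descriptor ^= 1 << pos
    return descriptor


def match_descriptors(desc_a, desc_b, max_dist=50, ratio=0.9):
    """Brute-force matching; returns (index_a, index_b, distance) tuples."""
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((hamming(da, db), j) for j, db in enumerate(desc_b))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        # The best candidate must clearly beat the runner-up.
        distinctive = len(ranked) < 2 or best_dist < ratio * ranked[1][0]
        if best_dist <= max_dist and distinctive:
            matches.append((i, best_j, best_dist))
    return matches
```

With only a few flipped bits per descriptor every keypoint still finds its partner; once the number of flipped bits exceeds `max_dist`, the matches disappear and initialization has nothing to work with.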

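For monocular initialization, ORB-SLAM2 computes two models in parallel, a homography and a fundamental matrix, and picks one of them by score. (PnP is used later, for relocalization, not for initialization.) I don't know why, but in my tests the initialization always happened with the homography. This is something worth analysing.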
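
For reference, the ORB-SLAM paper describes monocular initialization as computing a homography H and a fundamental matrix F in parallel, scoring both, and choosing between them with the ratio R_H = S_H / (S_H + S_F). A minimal sketch of that selection step follows. The function name is hypothetical, and the default threshold of 0.45 is an assumption based on the paper; the released code in Initializer.cc may use a different constant, so check it before relying on this.

```python
def select_model(score_h, score_f, threshold=0.45):
    """Pick the initialization model from the two scores (illustrative sketch).

    score_h: score of the homography model (S_H).
    score_f: score of the fundamental matrix model (S_F).
    Returns "homography", "fundamental", or None if both scores are zero.
    """
    total = score_h + score_f
    if total <= 0:
        return None
    ratio_h = score_h / total
    return "homography" if ratio_h > threshold else "fundamental"
```

Planar or low-parallax scenes are explained almost as well by a homography as by a fundamental matrix, which pushes R_H above the threshold; that may be one reason homography wins so often in practice.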

AlejandroSilvestri • Aug 22 '19 15:08

Thank you so much for these helpful instructions. I will start researching the BRIEF descriptor and its possible alternatives, and I will analyse the homography and PnP algorithms. Thanks again!

KarimHabbab92 • Aug 23 '19 12:08

@KarimHabbab92

If you are interested, look for these papers:

  • FAST: features from accelerated segment test
  • BRIEF: Binary Robust Independent Elementary Features
  • ORB: an efficient alternative to SIFT or SURF
  • ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras

For PnP (Perspective-n-Point):

  • EPnP: An Accurate O(n) Solution to the PnP Problem

EPnP is the one used in ORB-SLAM2.

Many of them are explained in the OpenCV reference.

AlejandroSilvestri • Aug 23 '19 15:08

Thank you so much !

KarimHabbab92 • Aug 23 '19 15:08