Monocular Semantic Slam with ORB-SLAM2?

Open NicksonYap opened this issue 7 years ago • 7 comments

Hi,

ORB-SLAM2 supports monocular SLAM.

I wonder if this code can be modified to use ORB-SLAM2's Monocular SLAM, instead of directly using RGB-D?

Thanks!

NicksonYap, Dec 10 '18 10:12

Hi,

Yes, it is possible. In fact, it is what we intended to do. The problem is that the depth estimated by monocular SLAM is relative (i.e. not in meters, but relative to the initial scale). The camera poses therefore come out at an arbitrary scale, while the point clouds are generated with the real (metric) depth. So when the camera moves around in the real world, the two no longer agree and the reconstruction breaks.
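To make the mismatch concrete, here is a minimal sketch, not code from this repository. It assumes a pinhole camera that only translates, and the intrinsics, the scene point, and the `pose_scale` value are all hypothetical. A static point is back-projected with metric depth from two camera positions, while the camera positions themselves are reported at the wrong scale:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with the given depth (camera frame)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def to_world(point_cam, cam_position):
    """Camera frame to world frame for a camera that only translates (no rotation)."""
    return tuple(p + c for p, c in zip(point_cam, cam_position))


# Hypothetical intrinsics and scene: one static point 2 m in front of the first view.
FX = FY = 500.0
CX, CY = 320.0, 240.0
TRUE_POINT = (0.0, 0.0, 2.0)


def observe(cam_position):
    """Project TRUE_POINT into a camera at cam_position; return pixel and metric depth."""
    x, y, z = (t - c for t, c in zip(TRUE_POINT, cam_position))
    return FX * x / z + CX, FY * y / z + CY, z


def reconstruct(true_cam_position, pose_scale):
    """Use metric depth, but place the camera at pose_scale * its true position."""
    u, v, depth = observe(true_cam_position)
    slam_position = tuple(pose_scale * c for c in true_cam_position)
    return to_world(backproject(u, v, depth, FX, FY, CX, CY), slam_position)


# The same physical point, seen before and after the camera moves 0.5 m sideways.
first = reconstruct((0.0, 0.0, 0.0), pose_scale=0.4)
second = reconstruct((0.5, 0.0, 0.0), pose_scale=0.4)
drift = second[0] - first[0]  # non-zero: the point is "smeared" in the map
consistent = reconstruct((0.5, 0.0, 0.0), pose_scale=1.0)  # metric poses agree
```

With `pose_scale=1.0` both views place the point at the same world position; with any other value the point lands somewhere different each time the camera moves.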

I think there are two solutions to work with monocular SLAM.

  1. Use depth prediction to estimate the real scale from RGB images, as CNN-SLAM does. Then integrate the estimated depth into ORB-SLAM2 to adjust the scale. Our initial attempts can be found in the depth_prediction branch or in the project report.

  2. Use the depth estimated by monocular SLAM and generate the point cloud from this depth. The drawback is that you won't have the real scale unless you can calibrate it. Also, ORB-SLAM2 is a feature-based SLAM system, so the map is sparse. Depth completion may be necessary to build a complete surface.

However, either approach needs a lot of work.
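Both options come down to finding one scale factor between the relative SLAM depth and a metric reference (predicted depth in option 1, a calibration measurement in option 2). A minimal sketch of one common way to do that, the median of per-point depth ratios, is shown here. It is not what the depth_prediction branch does; `estimate_scale`, `rescale_translation`, and the sample values are hypothetical.

```python
from statistics import median


def estimate_scale(slam_depths, metric_depths):
    """Estimate the factor that maps relative SLAM depth to metric depth.

    Both sequences hold depths of the same map points, in the same order.
    The median of the ratios is used so a few bad matches do not dominate.
    """
    ratios = [m / s for s, m in zip(slam_depths, metric_depths) if s > 0 and m > 0]
    if not ratios:
        raise ValueError("no valid depth pairs to estimate the scale from")
    return median(ratios)


def rescale_translation(translation, scale):
    """Bring a camera translation from SLAM units into metric units."""
    return tuple(scale * t for t in translation)


# Hypothetical sample: SLAM depths are off by a factor of 2.5; the last pair is an outlier.
slam_depths = [0.8, 1.2, 2.0, 0.5, 1.0]
metric_depths = [2.0, 3.0, 5.0, 1.25, 9.0]

scale = estimate_scale(slam_depths, metric_depths)
metric_translation = rescale_translation((0.2, 0.0, 0.0), scale)
```

A single global factor only holds if the SLAM scale does not drift over time, so in practice the estimate would probably need to be refreshed per keyframe.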

Xuan

floatlazer, Dec 10 '18 10:12

I'm not a pro, but is the actual scale really required for semantic SLAM, or for semantic segmentation in general?

Say we give up on a single camera: would multiple cameras (two or more) help?

Or at least integration/support for lower-cost, more widely used RGB-D sensors such as the Orbbec Astra Pro or the Intel RealSense D435 (going for 180 USD now)?

NicksonYap, Dec 10 '18 16:12

A stereo camera could work, as ORB-SLAM2 supports stereo cameras.

Semantic segmentation is done on the RGB image only. Depth information is only required for SLAM and reconstruction.
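As a rough illustration of that split (a sketch under assumptions, not the project's pipeline): the class labels come from the RGB image alone, and depth is only used to lift each labeled pixel into 3D. The function name, the list-of-lists image layout, and the toy intrinsics are hypothetical.

```python
def labeled_points(labels, depth, fx, fy, cx, cy):
    """Pair each pixel's class label with its back-projected 3D point.

    labels and depth are row-major 2D lists of the same size.
    A depth of 0 (or less) means the sensor gave no reading for that pixel.
    """
    points = []
    for v, (label_row, depth_row) in enumerate(zip(labels, depth)):
        for u, (label, z) in enumerate(zip(label_row, depth_row)):
            if z <= 0:
                continue  # labeled in 2D, but cannot be placed in the 3D map
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z, label))
    return points


# Toy 2x2 "image": labels from segmentation, depth from the sensor (one missing reading).
labels = [["wall", "chair"], ["floor", "chair"]]
depth = [[2.0, 1.5], [0.0, 1.5]]
cloud = labeled_points(labels, depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The "floor" pixel still gets a label from the RGB image, but without a depth reading it never reaches the point cloud.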

We used an Asus Xtion camera in our experiments; other low-cost cameras should also work.

floatlazer, Dec 11 '18 03:12

@floatlazer I'm trying to buy the same sensor you're using. What's the exact model name of the one you used?

There are the regular Asus Xtion, the Asus Xtion PRO, the Asus Xtion Live, and the Asus Xtion PRO Live.

All so confusing...

http://wiki.ipisoft.com/Depth_Sensors_Comparison#Xtion_Live_vs_Xtion_vs_Carmine (see the very bottom of the page)

PRO seems to be meant for developers (same hardware but different software?), and Live seems to mean it has an RGB sensor.

Since RGB was used, I suppose you're using either the Asus Xtion Live or Asus Xtion PRO Live?

Can you check whether yours is a "PRO" model, and whether PRO is needed? (PRO is more costly.)

NicksonYap, Dec 13 '18 13:12

I created a new issue, #12, regarding the sensor model.

Please reply there, thanks!

NicksonYap, Dec 13 '18 13:12

@NicksonYap Were you able to implement monocular semantic SLAM?

N-G17, Aug 21 '19 20:08

@Neetika-Gupta

Nope, I did not give it a shot.

NicksonYap avatar Aug 22 '19 12:08 NicksonYap