
Cannot reproduce SLAM-based agent's results

Open SenZHANG-GitHub opened this issue 4 years ago • 10 comments

❓ Questions and Help

Problem

Thank you for your great work!

When I tested habitat_baselines/agents/slam_agents.py on mp3d_val in the PointNav task with RGBD input, I only got an SPL score of 0.013, which is far below the 0.42 reported in the Habitat paper. Any hints on how to make it right? Thanks a lot!

Results reported in Habitat

[Figure: PointNav results table from the Habitat paper]

Savva, Manolis, et al. “Habitat: A Platform for Embodied AI Research.” ICCV, 2019.

Results reported in prior work

I also noticed a performance gap for SLAM agents between Habitat and the original "Benchmarking Classic and Learned Navigation in Complex 3D Environments", where the reported SPL is 0.702.

[Figure: results table from Mishkin et al., 2019]

Mishkin, Dmytro, et al. “Benchmarking Classic and Learned Navigation in Complex 3D Environments.” arXiv preprint arXiv:1901.10915, 2019.

Though the authors only use a subset of MP3D episodes in their experiments, the performance gap (0.42 vs. 0.702) still seems somewhat unreasonable.

Steps to Reproduce

I run "python habitat_baselines/agents/slam_agents.py --task-config configs/tasks/pointnav_rgbd_mp3d_val.yaml" where pointnav_rgbd_mp3d_val.yaml is almost the same as pointnav_rgbd.yaml except for an extra DATASET spefication

DATASET:
  TYPE: PointNav-v1
  SPLIT: val
  DATA_PATH: data/datasets/pointnav/mp3d/v1/{split}/{split}.json.gz

I made some minimal modifications to slam_agents.py to get it to work, since there are bugs related to mismatches with newer versions of the project. (1) The BASELINE and TRAINER attributes have both been deprecated in habitat-api 0.1.3, so I replaced all occurrences of config.TRAINER.ORBSLAM2 with config.ORBSLAM2 and modified the handling of config and agent_config in main() as follows:

# Copy the ORBSLAM2 settings from the baseline config onto the task config
config = get_config()
agent_config = cfg_baseline()
agent_config.defrost()
config.defrost()
config.ORBSLAM2 = agent_config.ORBSLAM2
make_good_config_for_orbslam2(config)
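For completeness, the rest of main() then proceeds roughly as in the original file (a sketch only; the exact agent constructor arguments and benchmark setup may differ in your version):

# Sketch: build the agent from the merged config and run the benchmark
agent = ORBSLAM2Agent(config.ORBSLAM2)
benchmark = habitat.Benchmark(args.task_config)
metrics = benchmark.evaluate(agent)
for key, value in metrics.items():
    print("{}: {:.3f}".format(key, value))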

(2) The code imports angle_to_pi_minus_pi_2 as norm_ang but then uses the two names interchangeably, which leads to an error; I simply renamed all uses to angle_to_pi_minus_pi_2. (3) As pointed out in https://github.com/facebookresearch/habitat-api/issues/87, MAP_SIZE is set to balance GPU memory against map coverage. However, a small MAP_SIZE will fail if estimatedGoalPos2D falls outside the bounds. I slightly increased the preset MAP_SIZE from 40 to 45 and hard-bounded estimatedGoalPos2D to lie within the map (450x450) in set_offset_to_goal(self, observation):

# Clamp the estimated goal position to the map bounds and log any overflow
if self.estimatedGoalPos2D[0,0] > self.height_index_bound:
    self.ferr.write('position 0: {}\n'.format(self.estimatedGoalPos2D[0,0]))
    self.estimatedGoalPos2D[0,0] = self.height_index_bound
if self.estimatedGoalPos2D[0,1] > self.width_index_bound:
    self.ferr.write('position 1: {}\n'.format(self.estimatedGoalPos2D[0,1]))
    self.estimatedGoalPos2D[0,1] = self.width_index_bound

This might affect the result a little bit, but I believe the effect should not be large, since only 3 out of 495 episodes have outliers (episode 170: 450-466 at estimatedGoalPos2D[0,0]; episode 177: 452 at estimatedGoalPos2D[0,0]; episode 277: 459-487 at estimatedGoalPos2D[0,0]).
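For reference, the same clamping can be written more compactly (a sketch without the logging; it assumes estimatedGoalPos2D behaves like a 1x2 array and the bounds are scalar indices, i.e. hypothetical helpers defined as above):

# Clamp both goal-map indices to the map bounds (equivalent to the ifs above)
self.estimatedGoalPos2D[0, 0] = min(self.estimatedGoalPos2D[0, 0], self.height_index_bound)
self.estimatedGoalPos2D[0, 1] = min(self.estimatedGoalPos2D[0, 1], self.width_index_bound)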

SenZHANG-GitHub avatar Jan 04 '20 09:01 SenZHANG-GitHub

@SenZHANG-GitHub thank you for bringing this up.

Though the authors only use a subset of MP3D episodes in their experiments, the performance gap (0.42 vs. 0.702) still seems somewhat unreasonable.

The Habitat PointGoal dataset should be harder in terms of episode distance and geodesic-to-Euclidean distance ratio. The simulators are also different. Therefore, the numbers are not comparable.

I made some minimal modifications to slam_agents.py to get it to work, since there are bugs related to mismatches with newer versions of the project

We are interested in those fixes and would be grateful if you could send them as a PR.

When I tested habitat_baselines/agents/slam_agents.py on mp3d_val in the PointNav task with RGBD input, I only got an SPL score of 0.013, which is far below the 0.42 reported in the Habitat paper. Any hints on how to make it right? Thanks a lot!

Maybe @ducha-aiki or @abhiskk can help you here. As I remember, the SLAM baseline is very sensitive to the hyperparameters used, but that shouldn't be the case here, as RGB shouldn't change with code changes. You can also output the average distance to target for the val set to check that there is no issue with calling stop.
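For example, something along these lines (a rough sketch, assuming a habitat.Env with a distance-to-goal measure enabled in TASK.MEASUREMENTS; the metric key may differ between versions):

import numpy as np
import habitat

# Roll out the agent on the val split and average the final distance to goal
env = habitat.Env(config=config)
final_distances = []
for _ in range(len(env.episodes)):
    observations = env.reset()
    agent.reset()
    while not env.episode_over:
        observations = env.step(agent.act(observations))
    final_distances.append(env.get_metrics()["distance_to_goal"])
print("Average final distance to goal:", np.mean(final_distances))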

mathfac avatar Jan 10 '20 07:01 mathfac

Hi,

That's weird. You are right that results should not differ that much. Probably something is off.

Though the authors only use a subset of MP3D episodes in their experiments, the performance gap (0.42 vs. 0.702) still seems somewhat unreasonable.

Due to a simulator bug, some of the MP3D episodes were unsolvable in MINOS even for a human: one cannot "jump" from one height level to another, so the goals were unreachable. Thus we removed such episodes from evaluation. And indeed, those episodes were "hard" in the sense that the goal was far from the start.

The things I might suspect are image size, agent height, and FOV. If they are incorrect, then the map created will be incorrect as well. If the defaults have changed with some code change, that might be the case here.

https://github.com/facebookresearch/habitat-api/blob/master/habitat_baselines/slambased/data/mp3d3_small1k.yaml#L8 https://github.com/facebookresearch/habitat-api/blob/master/habitat_baselines/slambased/mappers.py#L98
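For example, you can print the values the mapper will see and compare them against the ORB-SLAM2 settings file (a sketch; field names follow the habitat-api 0.1.x config schema):

from habitat.config.default import get_config

# Print the sensor and agent parameters that the SLAM mapper depends on
config = get_config("configs/tasks/pointnav_rgbd_mp3d_val.yaml")
print("Depth size:", config.SIMULATOR.DEPTH_SENSOR.WIDTH, "x", config.SIMULATOR.DEPTH_SENSOR.HEIGHT)
print("HFOV:", config.SIMULATOR.RGB_SENSOR.HFOV)
print("Agent height:", config.SIMULATOR.AGENT_0.HEIGHT)
print("Camera position:", config.SIMULATOR.DEPTH_SENSOR.POSITION)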

ducha-aiki avatar Jan 10 '20 11:01 ducha-aiki

@mathfac @ducha-aiki

Many thanks for your reply!

Yeah, it makes sense now why MINOS and Habitat got different results :)

I'm now working on another project for an upcoming conference, but I will check the hyperparameter settings and perhaps some code mismatches later. Hopefully we can make the SLAM-based agents work again :)

SenZHANG-GitHub avatar Jan 11 '20 23:01 SenZHANG-GitHub

Hi, since you are talking about RGB input, can you please tell me how to configure the RL PPO trainer for RGB input? I changed the ppo_pointnav config to SENSORS: ["RGB_SENSOR"], thus removing the depth sensor. Is there anything else to be done? My reward curve is stuck at -0.09766. Thanks, Saurabh

saketd403 avatar Jan 23 '20 06:01 saketd403

Hi, I found that the SLAM-based agent may perform random actions. The code can be viewed at https://github.com/facebookresearch/habitat-api/blob/71d409ab214a7814a9bd9b7e44fd25f57a0443ba/habitat_baselines/agents/slam_agents.py#L356 What is the purpose of the random actions? They may cause an early stop in an episode, making its SPL zero.

StOnEGiggity avatar Mar 25 '20 00:03 StOnEGiggity

It happens that the agent gets stuck near a small obstacle that is not mapped. A small amount of random actions fixes the issue and improves results.
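The idea is roughly the following (a minimal sketch of the pattern, not the exact code in slam_agents.py; the names and probability are illustrative):

import random

def choose_action(planned_action, actions=("MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT"), random_prob=0.1):
    # Occasionally take a random action to escape small unmapped obstacles
    if random.random() < random_prob:
        return random.choice(actions)
    return planned_action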

ducha-aiki avatar Mar 26 '20 13:03 ducha-aiki

Hi, I also wanted to reproduce the results of the SLAM-based agent. While running python habitat_baselines/agents/slam_agents.py I get the following error:

File "habitat_baselines/agents/slam_agents.py", line 17, in <module>
    import orbslam2
ModuleNotFoundError: No module named 'orbslam2'

Is the orbslam2 module included with habitat-lab, or does it have to be installed separately? I would be very grateful if you could help me with this.

KenaHemnani avatar Dec 14 '20 14:12 KenaHemnani

It should be installed separately; please follow this guide: https://github.com/facebookresearch/habitat-lab/tree/master/habitat_baselines/slambased

ducha-aiki avatar Dec 14 '20 14:12 ducha-aiki

Thank you very much @ducha-aiki for your help and response. Can you give some hints on how to install ORB-SLAM2? Thank you in advance.

KenaHemnani avatar Dec 15 '20 09:12 KenaHemnani

Hi @SenZHANG-GitHub @ducha-aiki @mathfac, I ran "python habitat_baselines/agents/slam_agents.py --task-config configs/tasks/vln_r2r.yaml"

The following error occurred:

Trade::StanfordImporter::mesh(): incomplete face data Containers::Optional: the optional is empty Aborted (core dumped)

and when I ran "python habitat_baselines/agents/slam_agents.py --task-config configs/tasks/pointnav_rgbd.yaml", the following error occurred:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 11.76 GiB total capacity; 9.35 GiB already allocated; 32.50 MiB free; 9.48 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How can I solve it? Thank you

zhangyong511 avatar Aug 26 '22 08:08 zhangyong511