
Geodesic distance returns inf in many cases.

erick84mm opened this issue 4 years ago • 23 comments

Issue:

The geodesic distance returns inf in many cases. While running my experiments, I found that `geodesic_distance` frequently returns inf, which makes the distance_to_goal measurement in habitat-API unusable. However, I have verified that a walkable path exists from the source point to the endpoint in this example.

This is a serious problem for the SPL calculation.

For example, in the snippet below, if distance_to_target is inf on the last step, the episode does not count as successful even if the agent actually reached the goal:

distance_to_target = self._sim.geodesic_distance(
    current_position, episode.goals[0].position
)

if (
    hasattr(task, "is_stop_called") and
    task.is_stop_called and
    distance_to_target < self._config.SUCCESS_DISTANCE
):
    ep_success = 1
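To see concretely why an inf distance zeroes out the metric, here is a minimal self-contained sketch of the check above (the 0.2 threshold is an assumed SUCCESS_DISTANCE, not taken from the actual config):

```python
# Why an inf distance_to_target breaks the success check: comparing inf
# against any finite threshold is False, so ep_success stays 0 even if
# the agent stopped right at the goal.
success_distance = 0.2  # assumed value for illustration
distance_to_target = float("inf")
is_stop_called = True

ep_success = 0
if is_stop_called and distance_to_target < success_distance:
    ep_success = 1

# ep_success is still 0: the episode is scored as a failure.
```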

Steps to reproduce:

  1. Load the map "mp3d/vyrNrziPKCB/vyrNrziPKCB.glb" from Matterport.
  2. Run the following code.
import habitat
from habitat.sims.habitat_simulator.actions import HabitatSimActions

def transform_rgb_bgr(image):
    return image[:, :, [2, 1, 0]]


def example():
    env = habitat.Env(
        config=habitat.get_config("configs/tasks/pointnav.yaml")
    )
    observations = env.reset()
    
    distance = env._sim.geodesic_distance(
        [7.742532,  0.998569, 12.11764],[2.055959939956665, 0.19856905937194824, 7.356540203094482]
    )
    
    print(distance)
   
if __name__ == "__main__":
    example()

erick84mm avatar Dec 26 '19 12:12 erick84mm

[7.742532, 0.998569, 12.11764] is a very odd position (screenshot attached) and I don't think the target location should be reachable from there. How did the agent manage to get there? If you have a full trajectory of positions, or even just the previous position, that would be helpful.

[screenshot]

erikwijmans avatar Dec 26 '19 20:12 erikwijmans

Thank you for your reply. In this case, every position is a translation from the VLN dataset, corresponding to the camera positions in the environment. However, the z coordinate is a bit off, since I couldn't find the correct mapping from VLN to habitat; I use the VLN z value at each viewpoint and then record the corrected z that the environment calculates by itself. That may be part of the issue. **If you know the answer to this unrelated question, please don't hesitate to tell me and I will fix my issue; I think right now the agent is 1.5 meters above the floor.** However, I still think every position should yield at least a finite value rather than infinity, since we calculate the SPL that way.

By the way, which sensor are you using in the viewer (the picture you shared)? I noticed it's better quality; I might be interested in using that sensor.

Here is another example; I think this position is not as odd as the previous one. So far I am working with a random agent that chooses actions randomly from the action space ['TURN_RIGHT', 'TELEPORT', 'STOP'].

In this case, I have included a log of the actions taken by the agent; the "new_" prefix denotes the destination and "prev_" denotes the origin.

The image represents the view of the agent after the first 'TURN_RIGHT' action happens.

[screenshot]

  1. Loaded navmesh data/scene_datasets/mp3d/vyrNrziPKCB/vyrNrziPKCB.navmesh

[{'action': 'TURN_RIGHT', 'new_heading': array(-1.70720336), 'new_image_id': '60438933298549fcbc9d4d2d25a0b852', 'new_pos': array([23.0007 , 3.598569, 16.1745 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(3.0051853), 'prev_image_id': '60438933298549fcbc9d4d2d25a0b852', 'prev_pos': array([23.0007 , 3.598569, 16.1745 ], dtype=float32), 'prev_rot': quaternion(0.0681508108973503, 0, 0.997675001621246, 0)}, {'action': 'TELEPORT', 'new_heading': array(-1.70720336), 'new_image_id': '9c242e45add84ffca6140c5eb7ae7d70', 'new_pos': array([19.8864 , 3.598569, 13.932 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(-1.70720336), 'prev_image_id': '60438933298549fcbc9d4d2d25a0b852', 'prev_pos': array([23.0007 , 3.598569, 16.1745 ], dtype=float32), 'prev_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0)}, {'action': 'TELEPORT', 'new_heading': array(-1.70720336), 'new_image_id': '86eba2e0b139463d9dc419ea6b71af87', 'new_pos': array([22.627 , 3.598569, 13.8074 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(-1.70720336), 'prev_image_id': '9c242e45add84ffca6140c5eb7ae7d70', 'prev_pos': array([19.8864 , 3.598569, 13.932 ], dtype=float32), 'prev_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0)}, {'action': 'TELEPORT', 'new_heading': array(-1.70720336), 'new_image_id': '9c242e45add84ffca6140c5eb7ae7d70', 'new_pos': array([19.8864 , 3.598569, 13.932 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(-1.70720336), 'prev_image_id': '86eba2e0b139463d9dc419ea6b71af87', 'prev_pos': array([22.627 , 3.598569, 13.8074 ], dtype=float32), 'prev_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0)}, {'action': 'TELEPORT', 'new_heading': array(-1.70720336), 'new_image_id': '86eba2e0b139463d9dc419ea6b71af87', 'new_pos': array([22.627 , 
3.598569, 13.8074 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(-1.70720336), 'prev_image_id': '9c242e45add84ffca6140c5eb7ae7d70', 'prev_pos': array([19.8864 , 3.598569, 13.932 ], dtype=float32), 'prev_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0)}, {'action': 'STOP', 'new_heading': array(-1.70720336), 'new_image_id': '86eba2e0b139463d9dc419ea6b71af87', 'new_pos': array([22.627 , 3.598569, 13.8074 ], dtype=float32), 'new_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0), 'prev_heading': array(-1.70720336), 'prev_image_id': '86eba2e0b139463d9dc419ea6b71af87', 'prev_pos': array([22.627 , 3.598569, 13.8074 ], dtype=float32), 'prev_rot': quaternion(0.65727299451828, 0, -0.753652572631836, 0)}]

Target path ['60438933298549fcbc9d4d2d25a0b852', '8880585c8f53461fbe827765c2cb6fc7', '4c9fab5cc3254cba83b3768b25f1b105', 'b0e387fe3ac24aad80279f6485d31fd6', '05a7eee165964cffb1a6ae6ace6a8c8a', 'efd2f06dc9de45598b70828ba6168bd8', '1f169ea3d73c4cc8b7f653c4d4ff15cb']

{'distance_to_goal': {'agent_path_length': 12.067958928537701, 'distance_delta': nan, 'distance_to_target': inf, 'start_distance_to_target': inf}, 'navigationError': inf, 'oracleSuccess': 0, 'spl': 0.0, 'success': 0, 'trajectoryLength': 12.067958928537701}

**PS:** Happy Holidays! Even if you might not celebrate them, I want to share the feeling of joy. Thank you for always helping me with this; I really appreciate it.

erick84mm avatar Dec 27 '19 05:12 erick84mm

The mapping between VLN-R2R panorama locations and habitat-sim is non-trivial and it's likely the infs you are running into are from the mapping not being correct. This PR did all the work to create that mapping: https://github.com/facebookresearch/habitat-api/pull/233 (download link for data here: https://github.com/facebookresearch/habitat-api#task-datasets) You should use the locations there. (cc @jacobkrantz @koshyanand)

I am just using habitat-sim's viewer. It isn't any higher quality, just higher resolution (you can render at an arbitrary resolution).

**PS:** Happy Holidays! Even if you might not celebrate them, I want to share the feeling of joy. Thank you for always helping me with this; I really appreciate it.

Thank you!

erikwijmans avatar Dec 27 '19 05:12 erikwijmans

Unfortunately, I have compared both sets of locations, and we get the same result in all cases. The PR removed 3220 locations with this problem from the dataset (I checked 3-5 of them and they are not in the training data).

Also, I played with the simulator for a bit, and it seems the reason is that the agent gets stuck at a glass door (the agent should be able to pass through the door, but it doesn't).

[animated gif of the agent stuck at the door]

Also, I checked the code, and it considers this section an island, so it returns inf.

erick84mm avatar Dec 27 '19 11:12 erick84mm

The PR removed 3220 locations with this problem from the dataset.

Yeah, due to reconstruction errors and artifacts, there were a considerable number of locations that simply didn't have a logical mapping. AFAIK, the paths released in that PR should all be traversable and should not have this inf geodesic-distance issue.

Also, I played with the simulator for a bit and it seems that the reason is that the agent gets stuck in a glass door, (the agent should be able to pass through the door but it doesn't).

We are aware that there are oddities in the simulator caused by reconstruction errors and artifacts. Reconstruction is a hard problem and a very open area of research (as is fixing reconstruction errors).

erikwijmans avatar Dec 27 '19 18:12 erikwijmans

I see, that makes sense; I understand the problem now. I will try to code around it. Thank you for the explanation.

erick84mm avatar Dec 28 '19 03:12 erick84mm

@erick84mm, the geodesic distance is calculated on top of a navigation mesh that contains navigable points and is computed for a particular agent size. If you have other logic in mind, you can construct your own custom distance metric on top of the geodesic and Euclidean distances. You can also use the snap_point function to get the closest navigable point in some cases.
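Following this suggestion, a minimal sketch of such a fallback metric (the `sim` object and its `geodesic_distance`/`pathfinder.snap_point` methods mirror the habitat-sim API discussed in this thread; the Euclidean last resort is one possible custom choice, not the library's behavior):

```python
import numpy as np


def robust_geodesic_distance(sim, start, goal):
    """Fallback sketch: if the geodesic distance is inf, snap the start
    point to the nearest navigable location and retry; as a last resort,
    fall back to the Euclidean distance (one possible custom metric)."""
    d = sim.geodesic_distance(start, goal)
    if np.isinf(d):
        snapped = sim.pathfinder.snap_point(np.array(start, dtype=np.float32))
        if not np.any(np.isnan(snapped)):
            d = sim.geodesic_distance(snapped, goal)
    if np.isinf(d):
        # Straight-line distance as a last resort so the metric stays finite.
        d = float(np.linalg.norm(np.asarray(goal) - np.asarray(start)))
    return d
```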

mathfac avatar Dec 29 '19 00:12 mathfac

@mathfac Thank you for your answer. I want to ask about the calculation of the z value translated from Matterport to habitat: did you take the information in the house file and the normalized Matterport value (in the range 0-10) to determine the z value in habitat, or did you use another method?

And yes at this moment what I'm thinking is the following:

For the discrete case (in my implementation I kept the notion of viewpoints and a graph): I can calculate the shortest-path distance using Dijkstra, as the Matterport sim does. That is no problem.

For the continuous case (non-graph, continuous motion): I was planning to check which of the viewpoints or directions the agent can see but collides with on the way. I am using the connectivity files from the Matterport sim to check for visible viewpoints. So if the agent gets close to a valid viewpoint but collides, I will search for the snap_point, teleport the agent there, and continue the navigation. Effectively, I will check whether the collision is valid in terms of the Matterport sim.

I have manually tested some scenarios and it seems to work, so I will try to replicate this in code.

erick84mm avatar Dec 29 '19 03:12 erick84mm

I want to ask you about the calculation of the z value translated from matterport to habitat

The z values aren't translated; the whole mesh is rotated so that -y is the gravity direction (as opposed to -z in the source meshes and in the Matterport sim). This can be done in Python via:

import numpy as np

import habitat_sim
from habitat_sim.utils.common import quat_from_two_vectors, quat_rotate_vector

# pt_mp3d is a point in Matterport3D coordinates (gravity along -z)
q_habitat_mp3d = quat_from_two_vectors(np.array([0.0, 0.0, -1.0]), habitat_sim.geo.GRAVITY)
pt_habitat = quat_rotate_vector(q_habitat_mp3d, pt_mp3d)

erikwijmans avatar Dec 29 '19 06:12 erikwijmans

@erick84mm thank you for the update and @erikwijmans for answering the question. If you manage to find a reasonable correspondence for the 3220 dropped locations from the VLN dataset, let us know.

mathfac avatar Jan 06 '20 20:01 mathfac

tl;dr: to translate Matterport points to habitat, in addition to the code mentioned by @erikwijmans, you need to subtract the camera's z value with respect to the agent (agent_state.sensor_states["rgb"].position).

@erikwijmans Thank you so much for the code. However, I think there is a misunderstanding here, and I think I have found the reason for it; maybe you already knew about this, but it is new to me, so I will share it here.

While your code works and produces valid coordinates on paper, the habitat simulator splits the z value into two parts: (1) the position of the camera with respect to the agent, and (2) the position of the agent itself. So, for example, if I have the Matterport point pt_mp3d and follow your code, it results in the point pt_habitat as follows:

import numpy as np

import habitat_sim
from habitat_sim.utils.common import quat_from_two_vectors, quat_rotate_vector

pt_mp3d = np.array([2.345, 1.78, 1.5698])

q_habitat_mp3d = quat_from_two_vectors(np.array([0.0, 0.0, -1.0]), habitat_sim.geo.GRAVITY)
pt_habitat = quat_rotate_vector(q_habitat_mp3d, pt_mp3d)

# The result is pt_habitat = [2.345, 1.5698, -1.78]

The resulting point is valid on its own, but when I input that point into the habitat simulator, the image comes from a higher position than expected.

For example, the image you showed before is an example of that happening. [screenshot]

So while pt_habitat = [2.345, 1.5698, -1.78] is a valid translation, the habitat simulator splits the z value between the camera position and the agent position. The camera position is fixed and follows the dataset, while the agent position's z value is adjusted accordingly. In summary, instead of using pt_habitat directly after the translation, it should be interpreted as:

camera_position = agent_state.sensor_states["rgb"].position
z = camera_position[1]
agent_state.position = [2.345, 1.5698 - z, -1.78]

Since the camera position is not available at the start, I took the camera z values from the camera parameters in the Matterport undistorted_camera_parameters file.

Using that, I was able to calculate the real values of the agent position and the camera position, which in this example works out to pt_habitat = [2.345, 0.0701, -1.78], since the camera z value was 1.4997 according to the undistorted_camera_parameters file.

By following that procedure I was able to find the real point that is the same in both simulators.
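For readers who want to check the conversion end-to-end, here is a pure-numpy sketch of the procedure above (no habitat_sim required). It assumes habitat's gravity direction is -y, so the rotation from quat_from_two_vectors([0, 0, -1], GRAVITY) reduces to a 90-degree rotation about -x, i.e. (x, y, z) → (x, z, -y); the 1.4997 m camera height is the example value read from undistorted_camera_parameters:

```python
import numpy as np

# Rotation taking Matterport3D coordinates (gravity = -z) into habitat
# coordinates (gravity = -y). Component-wise: (x, y, z) -> (x, z, -y).
R_HABITAT_FROM_MP3D = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, -1.0, 0.0],
])


def mp3d_to_habitat_agent_position(pt_mp3d, camera_height):
    """Rotate an mp3d camera position into habitat coordinates, then lower
    it by the camera height so it can be used as an *agent* position
    (habitat adds the sensor offset back on top of the agent)."""
    pt_habitat = R_HABITAT_FROM_MP3D @ np.asarray(pt_mp3d, dtype=np.float64)
    pt_habitat[1] -= camera_height
    return pt_habitat


# Reproduces the worked example above.
pt = mp3d_to_habitat_agent_position([2.345, 1.78, 1.5698], 1.4997)
# pt is approximately [2.345, 0.0701, -1.78]
```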

erick84mm avatar Jan 07 '20 00:01 erick84mm

@mathfac yes, I followed your advice to use the snap point. I did the following:

  1. First I calculate the geodesic distance between start_point and the goal.
  2. If the result is inf, I search for a snap_point for the start_point, since I know the goal should be reachable.
  3. Then I calculate the geodesic distance again and return the answer.
  4. If the answer is still inf, the place is unreachable; otherwise it is fine.
  5. I checked all Matterport points in the shortest-path setting and they all work as expected, although this introduces a navigation error of about 0.02 meters and the SPL drops to 99.89, since the agent has to travel a bit more distance.

Something like this.

distance_to_target = self._sim.geodesic_distance(
    current_position, episode.goals[-1].get_position()
)

if np.isinf(distance_to_target):
    print(
        "WARNING: The SPL metric might be compromised. "
        "geodesic_distance failed; looking for a snap_point instead."
    )
    new_position = np.array(current_position, dtype="f")
    new_position = self._sim._sim.pathfinder.snap_point(new_position)
    if np.isnan(new_position[0]):
        print(
            "ERROR: The SPL metric is compromised. "
            "geodesic_distance failed; cannot find a path."
        )
    else:
        current_position = new_position
        distance_to_target = self._sim.geodesic_distance(
            current_position, episode.goals[-1].get_position()
        )

So, in summary:

All the dropped points can be translated using the procedure in the post above, and navigation can then happen in both continuous and discrete forms by using the snap_point functionality as shown in the code here. So we can now use the full dataset without any problem, provided we accept a navigation error of 0.02 meters and an SPL score of 99.89% due to the snap_point mechanic.

erick84mm avatar Jan 07 '20 01:01 erick84mm

camera_position = agent_state.sensor_states["rgb"].position
z = camera_position[1]
agent_state.position = [2.345, 1.5698 - z, -1.78]

Since the camera position is not available at the start, I took the z values of the camera from the camera parameters in the matterport camera parameters file, I used undistorted_camera_parameters file for that.

Careful with this. The camera height in the Matterport3D simulator is not consistent (because it is not consistent in the Matterport3D dataset itself); most of the time it is ~1.25 m, but it is often higher or (a lot) lower. Habitat allows the agent to be placed at an arbitrary location (even weird/non-navigable ones), so an individual render will look fine, but moving forward will no longer work as expected.

If the result was inf I searched a snap_point for the start_point since I know the goal should be reachable.

This assumption is wrong. Some of the goals will no longer be reachable due to reconstruction errors -- the navigation graph in the Matterport3D simulator contains transitions that cannot be sensibly approximated in habitat-sim.

The procedure outlined seems dangerous. I am not sure how the search procedure is done, but it seems likely that the found point which has a path to the goal will no longer be valid for the corresponding natural language instruction.

erikwijmans avatar Jan 07 '20 02:01 erikwijmans

@erikwijmans Thank you again for your comments, really valuable indeed! And yes, you are right that the Matterport3D camera settings are really inconsistent. However, the actions MOVE_FORWARD, TURN_LEFT, TURN_RIGHT, etc. automatically correct the z value with respect to the agent position: they basically ignore the z and set whatever makes sense. I think this is because of the physics engine and its interaction with the agent size and the environment's gravity, so even when the value is slightly higher or lower, the environment resets it to the default position it should have.

So the main reason I wanted to find the correct value is that I wanted an accurate first RGB image from the simulator, since after the first action is performed the z value is corrected. This approach is very odd, but it worked until I discovered the height values in undistorted_camera_parameters and realized I can get the exact z value by subtracting that height.

As for the second point: noted. I noticed the same; however, when I find a position that requires a snap_point, I use the "TELEPORT" action instead of "MOVE_FORWARD", so we can jump directly to the correct position. As far as I understand, snap_point "will only search within a 4x8x4 cube centered around the point". This strategy doesn't seem great at first, but I tested it with the full dataset and it seems to work.

Just to be sure, I will perform more tests, and whenever a snap_point is used I will print both images to see which snap_point was selected. I did this as a test, but maybe I can look at a few more of the errors just to check that everything goes well.

erick84mm avatar Jan 07 '20 04:01 erick84mm

Just to clarify: snap_point is deterministic, i.e. snap_point(pt) == snap_point(pt), and if pt was already a navigable location, then snap_point(pt) == pt. Also, it simply finds the closest navigable location to the given point, and there is no way to constrain it (i.e. there is no way to tell it to find a point in the same connected component as another point).

use the "TELEPORT" action instead of "MOVE_FORWARD"

Ah, if you are using teleport, then I can see how this may work.

erikwijmans avatar Jan 07 '20 16:01 erikwijmans

I find similar inf issues in the Gibson dataset when I apply a continuous action space rather than a discrete one. The initial agent tends to walk randomly; the issue seems to happen easily when the current position and the target goal have different y-axis values.

StOnEGiggity avatar Jan 14 '20 12:01 StOnEGiggity

when I apply continuous action space rather than discrete action space

What are you referring to here? We don't have a standardized continuous action space.

erikwijmans avatar Jan 14 '20 15:01 erikwijmans

Maybe TELEPORT is considered an action with continuous action space.

mathfac avatar Jan 14 '20 20:01 mathfac

Hi, it looks like it was a bug in my code. I used quat_from_coeffs with the wrong input format [a, b, c, d], which led to a wrong agent pose.

StOnEGiggity avatar Jan 16 '20 05:01 StOnEGiggity

Here is another failure case, where geodesic_distance returns inf. I reset the start position and rotation in this episode to the same position where I find the inf. It looks like the agent has come to a dead end.

The episode id is 40356 and scene id is data/scene_datasets/gibson/Annawan.glb. The current position is [-1.410899, 0.17282124, -4.331585]. The current rotation is [0, -0.796320974826813, 0, -0.604875206947327]. The goal position is [0.9884241819381714, 0.17282123863697052, -6.123152732849121].

screenshot

StOnEGiggity avatar Jan 17 '20 10:01 StOnEGiggity

It's a little hard to tell, but this doesn't seem to be a bug. The geodesic distance between those locations is infinite, i.e. there is no navigable path from blue to red. Any large, flat(ish) surface gets marked as navigable, so there will be a number of places the agent can stand that are not connected to the floor.

erikwijmans avatar Jan 17 '20 19:01 erikwijmans

Thanks a lot. So we can't know whether the agent is in a different place before the geodesic distance returns inf, and the solution is to go back to the previous position when using the "TELEPORT" action. I don't know if I understand correctly.

StOnEGiggity avatar Jan 18 '20 06:01 StOnEGiggity

Internally, we have a check for whether there is a feasible path between two points (just checking whether the two points are part of the same connected component). That could be used to limit the agent to only teleport to locations that are part of the same connected component. We currently expose https://aihabitat.org/docs/habitat-sim/habitat_sim.nav.PathFinder.html#island_radius, which queries part of that system. You can use it to filter things (the probability that two separate connected components have exactly the same size is basically zero), i.e. pathfinder.island_radius(current_position) == pathfinder.island_radius(teleport_position).
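A minimal sketch of that filter (the `pathfinder` object is assumed to expose `island_radius()` as in habitat_sim.nav.PathFinder; the function name `can_teleport` is a hypothetical helper, not a library API):

```python
def can_teleport(pathfinder, current_position, teleport_position):
    """Heuristic filter: only allow a teleport when both points lie on the
    same connected component, approximated by comparing island radii (the
    chance of two distinct components having exactly the same radius is
    essentially zero)."""
    return (
        pathfinder.island_radius(current_position)
        == pathfinder.island_radius(teleport_position)
    )
```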

We can expose the has_path check if this fits your need.

erikwijmans avatar Jan 18 '20 19:01 erikwijmans