Discrete-Continuous-VLN

Question about the imitation learning strategy in the paper

Open · YESAndy opened this issue 4 months ago · 1 comment

Hi Yicong,

I noticed that the imitation learning loss used in the codebase is essentially the cross-entropy loss between the predicted action and the oracle action, where the oracle action is obtained by selecting the waypoint closest to the goal. However, this oracle action might not be optimal, because the closest waypoint is sometimes not on the ground-truth path (the reference path in the dataset), as in the following screenshot:

[Image: Screenshot 2024-03-07 at 5 52 59 PM]

Supervising the agent with such an off-path oracle action is likely to cause it to loop around that area.
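To make the concern concrete, here is a minimal toy sketch. It is not the repository's actual code: the function names, the 2D layout, the logits, and the use of straight-line distance are all my own assumptions for illustration, and the codebase may measure distance to the goal differently. It compares the oracle picked as "closest waypoint to the goal" with the waypoint that lies closest to the reference path, and shows that the cross-entropy loss penalizes an agent that prefers the on-path waypoint.

```python
import math


def oracle_closest_to_goal(waypoints, goal):
    """Index of the candidate waypoint with the smallest straight-line
    distance to the goal (assumed distance measure, for illustration)."""
    return min(range(len(waypoints)), key=lambda i: math.dist(waypoints[i], goal))


def point_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.dist(p, a)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def closest_to_reference_path(waypoints, path):
    """Index of the candidate waypoint nearest to the reference polyline."""
    def dist_to_path(w):
        return min(point_to_segment(w, a, b) for a, b in zip(path, path[1:]))
    return min(range(len(waypoints)), key=lambda i: dist_to_path(waypoints[i]))


def cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a one-hot target index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]


# Hypothetical layout: the reference path detours around an obstacle.
goal = (4.0, 0.0)
reference_path = [(0.0, 0.0), (0.0, 2.0), (4.0, 2.0), (4.0, 0.0)]
# Candidate 0 points straight at the goal; candidate 1 follows the path.
waypoints = [(1.0, 0.0), (0.0, 1.0)]

oracle = oracle_closest_to_goal(waypoints, goal)
on_path = closest_to_reference_path(waypoints, reference_path)

# Hypothetical logits from an agent that prefers the on-path waypoint.
logits = [0.2, 1.5]
loss_vs_oracle = cross_entropy(logits, oracle)
loss_vs_on_path = cross_entropy(logits, on_path)

print("oracle (closest to goal):", oracle)
print("closest to reference path:", on_path)
print("loss vs oracle:", round(loss_vs_oracle, 4))
print("loss vs on-path target:", round(loss_vs_on_path, 4))
```

In this toy case the two targets disagree, so the loss pushes the agent toward the waypoint that is nearer to the goal in a straight line but off the reference path.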

Since the waypoint predictor shows very good results, could you comment on how it manages to avoid the above issue?

Many thanks! Andy

YESAndy · Mar 08 '24 01:03