train/enjoy_husky_gibson_flagrun.py issues!
Hello everyone,
I have been studying the husky flagrun algorithm for a long time, and I have some problems with it. Despite trying everything, the agent is not able to learn how to go to the cube (target).
-
First of all, I couldn't understand the reward function, which contains only alive_score, progress, and obstacle_dist. There is no close_to_target term that would drive the agent toward the target.
-
Second, the target location does not seem to change in any file. There is only a line in _flag_reposition() like self.walk_to_target = ball_xyz, which does not seem to contribute to the reward function or the learning process.
-
The last thing: there is a sentence in the paper, "We trained a perceptual and non-perceptual husky agent according to the setting in Sec. 4.1 with PPO [78] for 150 episodes (300 iterations, 150k frames)." Is the correct calculation 150k frames / 300 iterations = 500 timesteps * batch? A timesteps-times-batch product of 500 seems too low.
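As a back-of-the-envelope check, here is the arithmetic I mean (my interpretation of the paper's numbers, not taken from the paper itself):

```python
# My interpretation of the paper's numbers, not taken from the paper itself.
frames_total = 150_000   # "150k frames"
iterations = 300         # "300 iterations"
frames_per_iteration = frames_total // iterations
print(frames_per_iteration)  # 500, i.e. timesteps * batch per PPO iteration
```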
I would be grateful for answers to these questions. Thanks.
-
Reward: alive_score is a reward term that prevents the agent from tipping over; progress is the difference of the potential function between two consecutive timesteps (a dense reward); obstacle distance penalizes getting too close to an obstacle.
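For concreteness, here is a minimal sketch of how these three terms could be combined; the names, magnitudes, and the obstacle threshold below are illustrative assumptions, not the exact GibsonEnv code. Note that the progress term already rewards getting closer to the target, since the potential is typically the negative distance to it:

```python
def flagrun_reward(alive, potential_prev, potential_curr, obstacle_dist,
                   obstacle_thresh=0.3):
    """Combine the three reward terms described above. All names, magnitudes,
    and the 0.3 threshold are illustrative assumptions, not GibsonEnv's code."""
    alive_score = 1.0 if alive else -1.0           # discourage tipping over
    progress = potential_curr - potential_prev     # dense shaping: got closer?
    obstacle_penalty = -1.0 if obstacle_dist < obstacle_thresh else 0.0
    return alive_score + progress + obstacle_penalty

# Example: upright robot that moved 0.2 closer to the target, no obstacle nearby
print(flagrun_reward(True, -5.0, -4.8, obstacle_dist=1.2))  # 1.2
```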
-
The target location is changed in _flag_reposition(): in that function a random force is applied to the red cube, which throws it somewhere within the room, and that new location becomes the target.
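A minimal sketch of that mechanism using pybullet (the function name, force magnitudes, and frame choice are assumptions, not the exact GibsonEnv implementation):

```python
import numpy as np
import pybullet as p

def flag_reposition(flag_body_id, force_scale=600.0):
    """Throw the red cube to a new spot by applying a random horizontal force,
    as described above. Names and magnitudes are illustrative assumptions."""
    force_x, force_y = np.random.uniform(-force_scale, force_scale, size=2)
    # linkIndex=-1 targets the base; LINK_FRAME applies the force at the
    # cube's origin in its local frame
    p.applyExternalForce(flag_body_id, -1,
                         forceObj=[force_x, force_y, 50.0],
                         posObj=[0.0, 0.0, 0.0],
                         flags=p.LINK_FRAME)
```
-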
The policy is able to converge with a small number of environment steps because it receives ground truth localization, i.e. the agent knows where the target is and only needs to perform local planning/obstacle avoidance.
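To illustrate why ground-truth localization makes this easy, here is a sketch of what a non-perceptual observation could encode (illustrative names, not GibsonEnv's actual observation code): the target's position expressed in the robot's frame, which reduces the task to local steering.

```python
import numpy as np

def target_in_robot_frame(robot_xy, robot_yaw, target_xy):
    """Illustrative sketch: with ground-truth localization the observation can
    directly encode the target's position in the robot's local frame."""
    delta = np.asarray(target_xy, dtype=float) - np.asarray(robot_xy, dtype=float)
    c, s = np.cos(-robot_yaw), np.sin(-robot_yaw)
    # rotate the world-frame offset by -yaw into the robot's local frame
    return np.array([c * delta[0] - s * delta[1],
                     s * delta[0] + c * delta[1]])

# Target 2 m ahead of a robot at the origin facing +x
print(target_in_robot_frame([0.0, 0.0], 0.0, [2.0, 0.0]))  # [2. 0.]
```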
Can you plot your reward curve during your training process? This would be insightful! Thanks.
Thank you for your quick response, Fei. You are awesome :) I know the rewards, but according to the enjoy results the agent couldn't reach the target. I also tried training with self.robot.set_target_position(ball_xyz) added. Anyway, I will plot my results in a few minutes. Thank you.

[Reward curve plot: Timesteps: 600, Episodes: 20, Iterations: 250]