gym-duckietown

Potentially wrong reward

Open Max-Fu opened this issue 3 years ago • 1 comment

Inside the gym environment, there are two robot speed attributes: self.speed and self.robot_speed. While self.robot_speed is set to a constant, self.speed is the true speed. Yet the reward function uses self.robot_speed instead of self.speed (check this). I think this creates a reward mis-specification problem (i.e. DDPG learns a trivial policy). Can one of the repo creators check whether this is indeed an error? Thanks! (I just restarted my run and will check whether this solves the issue.)
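To illustrate the concern, here is a toy sketch. It is not the actual gym-duckietown code: the reward form, the `ROBOT_SPEED` value, and the function names are assumptions made up for this example. It only shows that if the speed term is fed a constant, a stopped robot and a moving robot get the same reward, so the agent has no incentive to drive.

```python
# Toy sketch only; NOT the gym-duckietown implementation.
# ROBOT_SPEED is a hypothetical constant standing in for self.robot_speed.
ROBOT_SPEED = 1.2


def reward(speed, dot_dir, dist):
    """Hypothetical reward: favour progress along the lane, penalise lane offset."""
    return speed * dot_dir - 10.0 * abs(dist)


def compare(actual_speed, dot_dir=1.0, dist=0.0):
    """Return (reward using the constant, reward using the actual speed)."""
    r_const = reward(ROBOT_SPEED, dot_dir, dist)
    r_true = reward(actual_speed, dot_dir, dist)
    return r_const, r_true


stopped = compare(actual_speed=0.0)
moving = compare(actual_speed=0.8)

# With the constant, both cases score the same; with the actual speed they differ.
print("constant speed:", stopped[0], moving[0])
print("actual speed:  ", stopped[1], moving[1])
```

If the real reward behaves like this, standing still in the lane would be rewarded as much as driving, which would be consistent with the trivial policy I am seeing.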

Max-Fu, Nov 21 '20 00:11

@Max-Fu I think there has not been a lot of testing and tuning of that reward function. Please submit a PR if you can improve the current version.

CourchesneA, Nov 23 '20 15:11