
Reward is always 0 in Hopper

Open · mertalbaba opened this issue 1 year ago · 1 comment

In Hopper, the reward is always 0, independent of the action. The problem is that this code, used in the reward calculation in the `get_reward(self, physics)` function in `hopper.py`, always returns 0 as the value of `standing`: `standing = rewards.tolerance(physics.height(), (_STAND_HEIGHT, 2))`

When I checked `env._physics.height()` after an environment reset, I saw that the height is always initialized to a negative value. It is therefore impossible for the fragment `rewards.tolerance(physics.height(), (_STAND_HEIGHT, 2))` to return 1, since `_STAND_HEIGHT` is defined as 0.6 for Hopper.
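A minimal sketch of how to reproduce this observation, assuming the standard `dm_control.suite` API; `_STAND_HEIGHT`, `height()`, and `rewards.tolerance` are the `hopper.py` names quoted above:

```python
# Minimal reproduction sketch (assumes dm_control is installed).
import numpy as np
from dm_control import suite
from dm_control.suite.hopper import _STAND_HEIGHT  # defined as 0.6
from dm_control.utils import rewards

env = suite.load(domain_name='hopper', task_name='stand')
env.reset()

# The report says this height is negative right after reset.
height = env.physics.height()
print('height after reset:', height)

# The 'standing' term from get_reward(): with bounds (0.6, 2) and the
# default margin of 0, tolerance() returns 1 inside the bounds and 0
# outside, so a negative height yields standing == 0.
standing = rewards.tolerance(height, (_STAND_HEIGHT, 2))
print('standing term:', standing)

# Stepping with any action then keeps the reward at 0, per the report.
zero_action = np.zeros(env.action_spec().shape)
time_step = env.step(zero_action)
print('reward:', time_step.reward)
```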

mertalbaba · Jul 10 '23 13:07


Hi! Did you solve the problem? Much appreciated.

LiuZhenxian123 · Mar 10 '24 04:03