HighwayEnv
collect three types of rewards during testing
Hi, I am working on a project and running experiments on the highway-fast-v0 environment. Is there a way to get the three types of rewards at each step during testing directly, or should I rewrite the library code myself? Thanks!
Hi @zijianh4,
Currently, the different types of rewards (I imagine you mean safety, efficiency, comfort, etc.) are summed into a single scalar reward, which is what the agent optimises.
If you want to keep track of these separate terms, one possibility is to add them to the info field defined in the OpenAI Gym interface, so that you can access them this way:
>>> obs, reward, done, info = env.step(action)
>>> info
{
    "speed": 15.0,
    "crashed": False,
    "acceleration": 1.1,
}
for example.
That requires changing the code to add these additional info fields, yes. A few are currently written by default, see here: https://github.com/eleurent/highway-env/blob/9d63973da854584fe51b00ccee7b24b1bf031418/highway_env/envs/common/abstract.py#L150
You can add other fields directly there, or, for anything env-specific, you should rather override the _info() method in your environment.
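A minimal sketch of that subclass-and-override pattern. The classes and reward terms below are stand-ins for illustration only (BaseEnv plays the role of highway-env's AbstractEnv, and the reward formulas are made up), but the _info() override mechanism is the same:

```python
# Illustrative sketch: subclass the environment and extend _info() so that
# each reward term is reported separately in the info dict returned by step().
# BaseEnv is a toy stand-in for the real environment class.

class BaseEnv:
    """Toy stand-in for highway_env's AbstractEnv."""
    def __init__(self):
        self.speed = 15.0
        self.crashed = False

    def _info(self, obs, action):
        # Default info fields, analogous to those written in abstract.py
        return {"speed": self.speed, "crashed": self.crashed}

    def step(self, action):
        obs, reward, done = None, 0.0, False
        info = self._info(obs, action)  # the override is picked up here
        return obs, reward, done, info


class SeparateRewardsEnv(BaseEnv):
    """Override _info() to expose each reward term individually."""
    def _info(self, obs, action):
        info = super()._info(obs, action)  # keep the default fields
        # Hypothetical reward terms, for illustration only:
        info["collision_reward"] = -1.0 * self.crashed
        info["high_speed_reward"] = 0.4 * min(max(self.speed / 30.0, 0.0), 1.0)
        return info


env = SeparateRewardsEnv()
obs, reward, done, info = env.step(action=1)
```

After this, each term is available in info at every step, alongside the default fields.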
Hi @eleurent,
Thanks for your reply. I overrode the _info() function in my testing code like this, but it doesn't seem to work. Specifically, I define the function before if __name__ == '__main__': like this:
import numpy as np

from highway_env.envs.common.action import action_factory, Action, DiscreteMetaAction, ActionType
from highway_env.vehicle.controller import ControlledVehicle
from highway_env import utils

Observation = np.ndarray

def _info_sep_re(self, obs: Observation, action: Action) -> dict:
    """
    Return a dictionary of additional information

    :param obs: current observation
    :param action: current action
    :return: info dict
    """
    neighbours = self.road.network.all_side_lanes(self.vehicle.lane_index)
    lane = self.vehicle.target_lane_index[2] if isinstance(self.vehicle, ControlledVehicle) \
        else self.vehicle.lane_index[2]
    scaled_speed = utils.lmap(self.vehicle.speed, self.config["reward_speed_range"], [0, 1])
    collision_reward = self.config["collision_reward"] * self.vehicle.crashed
    right_lane_reward = self.config["right_lane_reward"] * lane / max(len(neighbours) - 1, 1)
    high_speed_reward = self.config["high_speed_reward"] * np.clip(scaled_speed, 0, 1)
    info = {
        "speed": self.vehicle.speed,
        "crashed": self.vehicle.crashed,
        "action": action,
        "collision_reward": collision_reward,
        "right_lane_reward": right_lane_reward,
        "high_speed_reward": high_speed_reward,
    }
    try:
        info["cost"] = self._cost(action)
    except NotImplementedError:
        pass
    return
and then I set highway_env._info = _info_sep_re in if __name__ == '__main__':. There is no error when overriding the function, but when I try to log "collision_reward", "right_lane_reward" and "high_speed_reward", it doesn't work and those keys are not in the info dictionary. Could you please help me with this issue? Thanks!
Shouldn't you just replace return by return info?
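For what it's worth, there look to be two separate pitfalls in the snippet above: the bare return (the function returns None instead of the dict), and assigning the replacement to the module (highway_env._info = ...) rather than to the environment class, so step() never calls it. A toy illustration of the class-level patch, using a minimal stand-in class instead of the real environment:

```python
# Toy demonstration: step() looks up _info on the instance/class, not on
# the module, so a method override must be assigned to the class.

class Env:
    """Stand-in for the environment class; not highway-env's actual API."""
    def _info(self, obs, action):
        return {"speed": 15.0}

    def step(self, action):
        return None, 0.0, False, self._info(None, action)


def _info_sep_re(self, obs, action):
    info = {"speed": 15.0, "collision_reward": -1.0}
    return info  # a bare `return` here would make info None


# Patch the class (not the module!) so every instance uses the new method.
Env._info = _info_sep_re

_, _, _, info = Env().step(action=0)
print("collision_reward" in info)  # True
```

Subclassing and overriding _info(), as suggested earlier in the thread, is the cleaner option; this class-level assignment is just the minimal fix to the monkey-patching approach shown above.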