HighwayEnv
Issues with multi-agent settings
Dear author, I am implementing multi-agent settings using highway-v0. I am not able to achieve stable training, and the vehicles can run off the road without the environment terminating. I took a look at the code: in the reward function https://github.com/Farama-Foundation/HighwayEnv/blob/7415379bb741d993557c88d50577df0de190959d/highway_env/envs/highway_env.py#L117-L135 and the termination function https://github.com/Farama-Foundation/HighwayEnv/blob/7415379bb741d993557c88d50577df0de190959d/highway_env/envs/highway_env.py#L136-L142 it seems that only self.vehicle is considered instead of self.controlled_vehicles. Any thoughts would be appreciated.
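For reference, the termination check at the linked commit boils down to roughly the following (a paraphrased sketch, not a verbatim copy of the file): only the ego `self.vehicle` is inspected, so a second controlled vehicle crashing or leaving the road never ends the episode.

```python
# Paraphrased sketch of HighwayEnv._is_terminated at the linked commit:
# only the ego vehicle (self.vehicle) is checked, so other controlled
# vehicles never trigger termination.
def _is_terminated(self) -> bool:
    return (self.vehicle.crashed
            or self.config["offroad_terminal"] and not self.vehicle.on_road)
```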
As far as I can see, it is necessary to implement a separate multi-agent version of the single-agent highway-env,
in addition to specifying the multi-agent action and observation spaces in the config. It looks like IntersectionEnv
is implemented with multi-agent support, but the other envs must be extended explicitly for multi-agent scenarios.
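A rough sketch of such an extension, under the assumption that you simply want every controlled vehicle to count towards reward and termination (the class name and the per-vehicle reward below are made up for illustration, they are not part of the library):

```python
from highway_env.envs.highway_env import HighwayEnv


class MultiAgentHighwayEnv(HighwayEnv):
    """Illustrative subclass that aggregates over all controlled vehicles."""

    def _vehicle_reward(self, vehicle) -> float:
        # Hypothetical per-vehicle reward: penalise crashes, reward speed.
        # This is a simplified placeholder, not the library's reward formula.
        reward = self.config["collision_reward"] * float(vehicle.crashed)
        reward += self.config["high_speed_reward"] * vehicle.speed / 30.0
        return reward

    def _reward(self, action) -> float:
        # Average the per-vehicle rewards instead of using self.vehicle only.
        return sum(self._vehicle_reward(v) for v in self.controlled_vehicles) \
            / len(self.controlled_vehicles)

    def _is_terminated(self) -> bool:
        # End the episode as soon as any controlled vehicle crashes or,
        # if "offroad_terminal" is set, leaves the road.
        return any(
            v.crashed
            or (self.config.get("offroad_terminal", False) and not v.on_road)
            for v in self.controlled_vehicles
        )
```

You would still need to register the new class (or instantiate it directly) and, as noted above, set MultiAgentAction/MultiAgentObservation in the config.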
Hey, when trying to run my highway script in multi-agent settings, I run into this error:

```
File ~\.conda\envs\spyder\Lib\site-packages\stable_baselines3\common\base_class.py:180 in __init__
    assert isinstance(self.action_space, supported_action_spaces), (
AssertionError: The algorithm only supports (<class 'gymnasium.spaces.discrete.Discrete'>,) as action spaces but Tuple(Discrete(5), Discrete(5)) was provided
```
Did you encounter the same error too? How did you solve it? Here is my env config:

```python
config = {
    "action": {
        "type": "MultiAgentAction",
        "action_config": {
            "type": "DiscreteMetaAction",
            "longitudinal": True,
            "lateral": True,
            "target_speeds": [50, 60, 70, 80],
        },
    },
    "observation": {
        "type": "MultiAgentObservation",
        "observation_config": {
            "type": "Kinematics",
            "vehicles_count": 8,
            "features": ["presence", "x", "y", "vx", "vy", "cos_h", "sin_h"],
            "absolute": False,
        },
    },
    "lanes_count": 3,
    "vehicles_count": 10,
    "controlled_vehicles": 2,
    "collision_reward": -1,
    "right_lane_reward": 0,
    "high_speed_reward": 1,
    "lane_change_reward": 0.1,
    "reward_speed_range": [20, 30],
}
env = gymnasium.make("highway-v0", config=config, render_mode="rgb_array")
```
This looks like a separate issue. You should check the algorithm that you are using. The RL algorithm (from Stable-Baselines3) that you are using seems to support only a single agent. You either need to modify the algorithm for multi-agent settings or use the multi-agent versions of RL algorithms available in Ray RLlib or other alternative libraries to train multiple agents.
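If you want to stay with Stable-Baselines3, one workaround is to train a single-agent policy and apply it to each controlled vehicle's observation at execution time. This is only a sketch under the assumption that the training env is configured with one controlled vehicle and the same Kinematics observation, so its spaces are plain Discrete/Box; `single_agent_env` and `multi_agent_env` below are placeholders for two such env instances, not library objects:

```python
from stable_baselines3 import DQN

# Train a normal single-agent policy (placeholder env: one controlled vehicle,
# no MultiAgentAction/MultiAgentObservation, same Kinematics features).
model = DQN("MlpPolicy", single_agent_env, verbose=1)
model.learn(total_timesteps=20_000)

# Execute on the multi-agent env: obs is a tuple with one entry per controlled
# vehicle, and env.step expects a tuple of actions back.
obs, info = multi_agent_env.reset()
done = truncated = False
while not (done or truncated):
    actions = tuple(
        int(model.predict(agent_obs, deterministic=True)[0])
        for agent_obs in obs
    )
    obs, reward, done, truncated, info = multi_agent_env.step(actions)
```

For genuinely independent learners or centralised training, RLlib's multi-agent API or a dedicated MARL library is the better fit, as suggested above.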