HighwayEnv icon indicating copy to clipboard operation
HighwayEnv copied to clipboard

Graph Reinforcement Learning implementation

Open redfish9 opened this issue 1 year ago • 2 comments

Thanks @eleurent for your awsome work!

I'm currently working with this framework and I'm wondering is highway_env compatible of handling multi-agent Graph Reinforcement Learning? In this case, the observation space is the product of the observation matrix O, adjacency matrix A, and mask matrix M.

Moreover, I'm also curious about the implementation of different behavior types of non-agent vehicles. What's your recommended approach for this? I'm considering creating separate classes at /highway_env/vehicle/behavior.py. I would greatly appreciate any tips or advice you could provide.

redfish9 avatar Jan 21 '24 06:01 redfish9

In terms of observation, does the "nearby vehicles" in KinematicObservation imply a similar operation like ajacenct matrix?

redfish9 avatar Jan 21 '24 07:01 redfish9

highway_env compatible of handling multi-agent Graph Reinforcement Learning? In this case, the observation space is the product of the observation matrix O, adjacency matrix A, and mask matrix M

It can probably be made compatible, but no as of now there is no support for this specific format

Moreover, I'm also curious about the implementation of different behavior types of non-agent vehicles. What's your recommended approach for this? I'm considering creating separate classes at /highway_env/vehicle/behavior.py

Yes that sounds right.

In terms of observation, does the "nearby vehicles" in KinematicObservation imply a similar operation like ajacenct matrix?

I'm not super familiar with the multi-agent graph exact model you are referring to, but yes it's probably similar as , given an observer vehicle, we iterate through all vehicles and compute some observations (e.g. could be relative like distance, angle, or absolute speed/position). Right now the focus is mostly on single agent scenarios so you get an array of [vehicles, features], and in the simple multi-agent extension which is current implemented you get a tuple such arrays. If every vehicle is an agent, this can be turned into an adjacency matrix/tensor.

eleurent avatar Jan 26 '24 14:01 eleurent

So grateful for your quick response! Now I'm clear about the logic behind.

redfish9 avatar Jan 29 '24 04:01 redfish9