
How to test GA3C-CADRL or CADRL in the gym-collision-avoidance environment?

Open · Feng-Kaijun opened this issue on Aug 28, 2022 · 2 comments

Feng-Kaijun commented on Aug 28, 2022

Hi, I have finished training GA3C-CADRL with TrainPhase1 and TrainPhase2, but I don't understand how to test this policy in order to reproduce the results reported in Table 1 of your latest paper (Collision Avoidance in Pedestrian-Rich Environments with Deep Reinforcement Learning). I am working on a survey based on your great work, especially the gym-collision-avoidance environment. Looking forward to your answer!


mfe7 commented on Aug 29, 2022

Thanks! It's great that you were able to train some new policies. There is some capability to run a trained policy in random scenarios within this repo, but I don't remember exactly how, and it sounds like you'd like to go beyond that anyway.

To test the policy as I did in the paper you referred to, I would suggest using the gym-collision-avoidance repo on its own and running this bash script (i.e., clone a fresh copy of the gym repo in a new location, since the submodule gym env inside this repo is probably not as up to date).

That bash script should automatically run a bunch of random test scenarios for various policies and numbers of agents, based on this config. You could edit self.POLICIES_TO_TEST to add a new policy key (e.g., GA3C-CADRL-Feng), and then add the corresponding key/value to this dict with the checkpoint path, etc. To start, you may want to make the new policy the only element in self.POLICIES_TO_TEST, as sketched below.
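In case it helps, here is a minimal sketch of what that config edit might look like. The class name, attribute names, and checkpoint-dict fields are assumptions on my part, so match them against the actual config file the bash script loads:

```python
# Hypothetical sketch of the config edit described above. The class name,
# attribute names, and dict structure are assumptions -- adjust them to
# match the real config file referenced by the bash script.

class EvaluationConfig:
    def __init__(self):
        # Policies evaluated in the random test scenarios. To start, make
        # your new policy the only entry so failures are easy to isolate.
        self.POLICIES_TO_TEST = [
            "GA3C-CADRL-Feng",   # new policy key
            # "GA3C-CADRL-10",   # re-enable baselines once the new key works
            # "CADRL",
        ]

        # Map each policy key to whatever is needed to load it, e.g. the
        # checkpoint produced by TrainPhase2 (paths below are placeholders).
        self.POLICY_SETTINGS = {
            "GA3C-CADRL-Feng": {
                "policy": "GA3C_CADRL",
                "checkpt_dir": "path/to/your/run",   # placeholder
                "checkpt_name": "network_01900000",  # placeholder
            },
        }
```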

This bash script should, for each number of agents and each policy_to_test, run the same N pre-defined random test scenarios, with all agents in a given scenario running the same policy_to_test. Note that this config only runs 4 test cases by default, which is probably good to start with so you can confirm everything is set up correctly and the results are logged, but you should eventually increase this number (I think 500 was used in the paper?).

If things are working properly, the bash script should generate a bunch of png files of the agent trajectories and also log some pkl files with the results/stats of each test episode. I believe this script is what I used to generate the numbers in the table once the experiments had run -- running 500 experiments for each policy and each number of agents took a while (hours). Unfortunately, that last script seems to have some hard-coded paths and I'm not sure it will work right away, but maybe it can serve as a guide; a rough sketch of reading the pkl logs is below.
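Here is a rough, hypothetical sketch of aggregating the per-episode pkl logs into table-style numbers. The glob pattern and dict keys are assumptions, so inspect one pkl file first and adjust them to whatever the repo actually logs:

```python
# Hypothetical sketch of turning per-episode pkl logs into summary stats.
# The file layout ("logs/**/stats.pkl") and dict keys ("collision",
# "time_to_goal") are assumptions -- check a real pkl file and adapt.
import glob
import pickle

import numpy as np

stats = []
for path in glob.glob("logs/**/stats.pkl", recursive=True):  # placeholder pattern
    with open(path, "rb") as f:
        stats.append(pickle.load(f))  # assume one episode dict per file

if not stats:
    raise SystemExit("no pkl logs found -- adjust the glob pattern above")

# Example aggregate metrics, assuming each episode dict stores these keys.
collisions = [ep["collision"] for ep in stats]
times = [ep["time_to_goal"] for ep in stats if not ep["collision"]]

print(f"episodes: {len(stats)}")
print(f"collision rate: {np.mean(collisions):.3f}")
print(f"mean time to goal (non-colliding episodes): {np.mean(times):.2f} s")
```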

Please let me know if this works or if you run into other issues!
