Multi-Agent-Reinforcement-Learning
PyTorch implementations of multi-agent reinforcement learning algorithms, including QMIX, Independent PPO, Centralized PPO, Grid-Wise Control, Grid-Wise Control + PPO, and Grid-Wise Control + DDPG.
Abstract
This project implements multi-agent reinforcement learning algorithms in PyTorch, including Grid-Wise Control, QMIX, and Centralized PPO. Different learning strategies can be specified during training, and the model and experimental data can be saved.
Quick Start: Run the main.py script to start training. Specify all parameters in the config.yaml file. The parameters used in this project are not optimal; adjust them to your requirements.
PettingZoo
MPE: Multi Particle Environments (MPE) are a set of communication-oriented environments where particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks.
These environments are from OpenAI’s MPE codebase, with several minor fixes, mostly related to making the action space discrete by default, making the rewards consistent and cleaning up the observation space of certain environments.
This project uses the Simple Spread environment (I'm also considering adding other environments in future releases).
[Image: Env image]
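To make the environment interface concrete, here is a minimal sketch of one episode in Simple Spread driven through PettingZoo's parallel API. It is not taken from this project's main.py. The helper names `rollout` and `make_simple_spread` are hypothetical, and the return shapes assume the PettingZoo 1.12.0 release listed under Requirement (later releases changed the `reset()` and `step()` return values).

```python
from typing import Any, Callable, Dict


def rollout(env: Any, policy: Callable[[str, Any], Any], max_steps: int = 25) -> Dict[str, float]:
    """Run one episode and return each agent's summed reward.

    Assumes the PettingZoo 1.12-style parallel API: reset() returns a dict of
    observations, and step() returns (observations, rewards, dones, infos).
    """
    observations = env.reset()
    returns = {agent: 0.0 for agent in observations}
    for _ in range(max_steps):
        # One joint step: every live agent picks an action from its own observation.
        actions = {agent: policy(agent, obs) for agent, obs in observations.items()}
        observations, rewards, dones, _infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)
        # Stop once no agent is left or every agent reports done.
        if not observations or all(dones.values()):
            break
    return returns


def make_simple_spread(max_cycles: int = 25) -> Any:
    """Build Simple Spread (hypothetical helper, not part of this project).

    The import is local so that rollout() can be used without PettingZoo installed.
    """
    from pettingzoo.mpe import simple_spread_v2  # module name as of PettingZoo 1.12.0
    return simple_spread_v2.parallel_env(max_cycles=max_cycles)
```

With PettingZoo installed, a random-policy episode could be run as `env = make_simple_spread()` followed by `rollout(env, lambda agent, obs: env.action_space(agent).sample())`. A trained QMIX or PPO policy would take the place of the random sampler.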
Requirements
Note: The following versions are suggestions only; the program may also work with other versions.
Name | Version |
---|---|
Python | 3.6.1 |
gym | 0.21.0 |
numpy | 1.19.1 |
PettingZoo | 1.12.0 |
PyTorch | 1.6.0+cu101 |
Corresponding Papers
- Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
Reference
- PettingZoo:
@article{terry2020pettingzoo,
  title={PettingZoo: Gym for Multi-Agent Reinforcement Learning},
  author={Terry, J. K and Black, Benjamin and Grammel, Nathaniel and Jayakumar, Mario and Hari, Ananth and Sulivan, Ryan and Santos, Luis and Perez, Rodrigo and Horsch, Caroline and Dieffendahl, Clemens and Williams, Niall L and Lokesh, Yashas and Sullivan, Ryan and Ravi, Praveen},
  journal={arXiv preprint arXiv:2009.14471},
  year={2020}
}