Multi-Agent-Reinforcement-Learning-papers
Multi-Agent Reinforcement Learning (MARL) papers
This is a collection of Multi-Agent Reinforcement Learning (MARL) papers. Each category is a potential starting point for your research. Some papers are listed more than once because they belong to multiple categories.
For MARL papers with code and MARL resources, please refer to MARL Papers with Code and MARL Resources Collection.
I will continually update this repository and welcome suggestions (missing important papers, missing categories, invalid links, etc.). This is only a first draft so far, and I'll add more resources in the next few months.
This repository is not for commercial purposes.
My email: [email protected]
Overview
- Reviews
- Environments
- Dealing With Credit Assignment Issue
- Policy Gradient
- Communication
- Emergent
- Opponent Modeling
- Game Theoretic
- Hierarchical
- Ad Hoc Teamwork
- League Training
- Curriculum Learning
- Mean Field
- Transfer Learning
- Meta Learning
- Fairness
- Exploration
- Graph Neural Network
- Model-based
- NAS
- Safe Multi-Agent Reinforcement Learning
- From Single-Agent to Multi-Agent
- Discrete-Continuous Hybrid Action Spaces / Parameterized Action Space
- Role
- Diversity
- Sparse Reward
- Large Scale
- DTDE
- Decision Transformer
- Offline MARL
- Generalization
- Adversarial
- Multi-Agent Path Finding
- To be Categorized
- TODO
Reviews
Recent Reviews (Since 2019)
- A Survey and Critique of Multiagent Deep Reinforcement Learning
- An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective
- Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
- A Review of Cooperative Multi-Agent Deep Reinforcement Learning
- Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning
- A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
- Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications
- A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems
Other Reviews (Before 2019)
- If multi-agent learning is the answer, what is the question?
- Multiagent learning is not the answer. It is the question
- Is multiagent deep reinforcement learning the answer or the question? A brief survey (Note: A Survey and Critique of Multiagent Deep Reinforcement Learning is an updated version of this paper by the same authors.)
- Evolutionary Dynamics of Multi-Agent Learning: A Survey
- (Worth reading although they're not recent reviews.)
Environments
Dealing With Credit Assignment Issue
Value Decomposition
Other Methods
Paper | Code | Accepted at | Year |
---|---|---|---|
COMA:Counterfactual Multi-Agent Policy Gradients | https://github.com/oxwhirl/pymarl | AAAI | 2018 |
LiCA:Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning | https://github.com/mzho7212/LICA | NIPS | 2020 |
Policy Gradient
Paper | Code | Accepted at | Year |
---|---|---|---|
MADDPG:Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | https://github.com/openai/maddpg | NIPS | 2017 |
COMA:Counterfactual Multi-Agent Policy Gradients | https://github.com/oxwhirl/pymarl | AAAI | 2018 |
IPPO:Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? | | | 2020 |
MAPPO:The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games | https://github.com/marlbenchmark/on-policy | | 2021 |
MAAC:Actor-Attention-Critic for Multi-Agent Reinforcement Learning | https://github.com/shariqiqbal2810/MAAC | ICML | 2019 |
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients | https://github.com/TonghanWang/DOP | ICLR | 2021 |
M3DDPG:Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient | | AAAI | 2019 |
Communication
Communication Without Bandwidth Constraint
Communication Under Limited Bandwidth
Emergent
Paper | Code | Accepted at | Year |
---|---|---|---|
Multiagent Cooperation and Competition with Deep Reinforcement Learning | | PLoS ONE | 2017 |
Multi-agent Reinforcement Learning in Sequential Social Dilemmas | | | 2017 |
Emergent preeminence of selfishness: an anomalous Parrondo perspective | | Nonlinear Dynamics | 2019 |
Emergent Coordination Through Competition | | | 2019 |
Biases for Emergent Communication in Multi-agent Reinforcement Learning | | NIPS | 2019 |
Towards Graph Representation Learning in Emergent Communication | | | 2020 |
Emergent Tool Use From Multi-Agent Autocurricula | https://github.com/openai/multi-agent-emergence-environments | ICLR | 2020 |
On Emergent Communication in Competitive Multi-Agent Teams | | AAMAS | 2020 |
QED:Quasi-Equivalence Discovery for Zero-Shot Emergent Communication | | | 2021 |
Incorporating Pragmatic Reasoning Communication into Emergent Language | | NIPS | 2020 |
Opponent Modeling
Game Theoretic
Hierarchical
Ad Hoc Teamwork
Paper | Code | Accepted at | Year |
---|---|---|---|
CollaQ:Multi-Agent Collaboration via Reward Attribution Decomposition | https://github.com/facebookresearch/CollaQ | | 2020 |
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems | | | |
Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork | | | |
Open Ad Hoc Teamwork using Graph-based Policy Learning | https://github.com/uoe-agents/GPL | ICML | 2021 |
League Training
Paper | Code | Accepted at | Year |
---|---|---|---|
AlphaStar:Grandmaster level in StarCraft II using multi-agent reinforcement learning | | | |
Curriculum Learning
Paper | Code | Accepted at | Year |
---|---|---|---|
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems | | | |
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning | https://github.com/starry-sky6688/MARL-Algorithms | AAAI | 2020 |
EPC:Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning | https://github.com/qian18long/epciclr2020 | ICLR | 2020 |
Emergent Tool Use From Multi-Agent Autocurricula | https://github.com/openai/multi-agent-emergence-environments | ICLR | 2020 |
Learning to Teach in Cooperative Multiagent Reinforcement Learning | | | |
StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning | | | |
Cooperative Multi-agent Control using deep reinforcement learning | https://github.com/sisl/MADRL | AAMAS | 2017 |
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems | | | |
Mean Field
Transfer Learning
Paper | Code | Accepted at | Year |
---|---|---|---|
A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems | | | |
Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning | | | |
Meta Learning
Fairness
Exploration
Paper | Code | Accepted at | Year |
---|---|---|---|
EITI/EDTI:Influence-Based Multi-Agent Exploration | https://github.com/TonghanWang/EITI-EDTI | ICLR | 2020 |
MAVEN:Multi-Agent Variational Exploration | https://github.com/starry-sky6688/MARL-Algorithms | NIPS | 2019 |
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning | | | |
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning | | | |
Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework | | | |
Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory | | | |
LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning | https://github.com/yalidu/liir | NIPS | 2019 |
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning | | | |
Graph Neural Network
Paper | Code | Accepted at | Year |
---|---|---|---|
Multi-Agent Game Abstraction via Graph Attention Neural Network | https://github.com/starry-sky6688/MARL-Algorithms | AAAI | 2020 |
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation | | | |
Multi-Agent Reinforcement Learning with Graph Clustering | | | |
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems | | | |
Model-based
Paper | Code | Accepted at | Year |
---|---|---|---|
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping | | | 2020 |
NAS
Paper | Code | Accepted at | Year |
---|---|---|---|
MANAS: Multi-Agent Neural Architecture Search | | | 2019 |
Safe Multi-Agent Reinforcement Learning
Paper | Code | Accepted at | Year |
---|---|---|---|
MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding | | | 2019 |
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman | | | 2019 |
From Single-Agent to Multi-Agent
Paper | Code | Accepted at | Year |
---|---|---|---|
IQL:Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents | https://github.com/oxwhirl/pymarl | ICML | 1993 |
IPPO:Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? | | | 2020 |
MAPPO:The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games | https://github.com/marlbenchmark/on-policy | | 2021 |
MADDPG:Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | https://github.com/openai/maddpg | NIPS | 2017 |
Discrete-Continuous Hybrid Action Spaces / Parameterized Action Space
Role
Paper | Code | Accepted at | Year |
---|---|---|---|
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles | https://github.com/TonghanWang/ROMA | ICML | 2020 |
RODE: Learning Roles to Decompose Multi-Agent Tasks | https://github.com/TonghanWang/RODE | ICLR | 2021 |
Diversity
Paper | Code | Accepted at | Year |
---|---|---|---|
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems | | AAMAS | 2021 |
Q-DPP:Multi-Agent Determinantal Q-Learning | https://github.com/QDPP-GitHub/QDPP | ICML | 2020 |
Diversity is All You Need: Learning Skills without a Reward Function | | | 2018 |
Modelling Behavioural Diversity for Learning in Open-Ended Games | | ICML | 2021 |
Diverse Agents for Ad-Hoc Cooperation in Hanabi | | CoG | 2019 |
Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning | | IJCAI | 2020 |
Quantifying environment and population diversity in multi-agent reinforcement learning | | | 2021 |
Sparse Reward
Paper | Code | Accepted at | Year |
---|---|---|---|
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems | | NIPS | 2021 |
Individual Reward Assisted Multi-Agent Reinforcement Learning | https://github.com/MDrW/ICML2022-IRAT | ICML | 2022 |
Large Scale
Paper | Code | Accepted at | Year |
---|---|---|---|
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning | https://github.com/starry-sky6688/MARL-Algorithms | AAAI | 2020 |
DTDE
Paper | Code | Accepted at | Year |
---|---|---|---|
Decision Transformer
Paper | Code | Accepted at | Year |
---|---|---|---|
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks | | | |
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem | | | |
Offline MARL
Paper | Code | Accepted at | Year |
---|---|---|---|
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks | | | |
Generalization
Paper | Code | Accepted at | Year |
---|---|---|---|
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios | https://github.com/James0618/unmas | TNNLS | 2021 |
Adversarial
Paper | Code | Accepted at | Year |
---|---|---|---|
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems | | | 2022 |
Distributed Multi-Agent Deep Reinforcement Learning for Robust Coordination against Noise | | | 2022 |
On the Robustness of Cooperative Multi-Agent Reinforcement Learning | | IEEE Security and Privacy Workshops | 2020 |
Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning | | CVPR Workshop | 2022 |
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient | | AAAI | 2019 |
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations | | NIPS Deep Reinforcement Learning Workshop | 2018 |
Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods | | | 2021 |
Multi-Agent Path Finding
- TODO
To be Categorized
Paper | Code | Accepted at | Year |
---|---|---|---|
Mind-aware Multi-agent Management Reinforcement Learning | https://github.com/facebookresearch/M3RL | ICLR | 2019 |
Emergence of grounded compositional language in multi-agent populations | https://github.com/bkgoksel/emergent-language | AAAI | 2018 |
Emergent Complexity via Multi-Agent Competition | https://github.com/openai/multiagent-competition | ICLR | 2018 |
TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning | https://github.com/tencent-ailab/TLeague | | 2020 |
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers | https://github.com/hhhusiyi-monash/UPDeT | ICLR | 2021 |
TODO
- Multi-Agent Path Finding
Citation
If you find this repository useful, please cite our repo:
@misc{chen2021multi,
author={Chen, Hao},
title={Multi-Agent Reinforcement Learning Papers},
year={2021},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/TimeBreaker/Multi-Agent-Reinforcement-Learning-papers}}
}