Multi-Agent Reinforcement Learning papers

This is a collection of Multi-Agent Reinforcement Learning (MARL) papers. Each category is a potential starting point for your research. Some papers are listed more than once because they belong to multiple categories.

For MARL papers with code and MARL resources, please refer to MARL Papers with Code and MARL Resources Collection.

I will continually update this repository and I welcome suggestions (missing important papers, missing categories, invalid links, etc.). This is only a first draft so far, and I'll add more resources in the next few months.

This repository is not for commercial purposes.

My email: [email protected]

Overview

Reviews

Recent Reviews (Since 2019)

Other Reviews (Before 2019)

If multi-agent learning is the answer, what is the question?
Multiagent learning is not the answer. It is the question
Is multiagent deep reinforcement learning the answer or the question? A brief survey (Note: A Survey and Critique of Multiagent Deep Reinforcement Learning is an updated version of this paper by the same authors.)
Evolutionary Dynamics of Multi-Agent Learning: A Survey
(Worth reading although they're not recent reviews.)

Environments

Environment	Paper	Code	Accepted at	Year
StarCraft	The StarCraft Multi-Agent Challenge	https://github.com/oxwhirl/smac	NIPS	2019
StarCraft	SMACv2: A New Benchmark for Cooperative Multi-Agent Reinforcement Learning	https://github.com/oxwhirl/smacv2		2022
StarCraft	Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks	https://github.com/uoe-agents/epymarl	NIPS	2021
Football	Google Research Football: A Novel Reinforcement Learning Environment	https://github.com/google-research/football	AAAI	2020
PettingZoo	PettingZoo: Gym for Multi-Agent Reinforcement Learning	https://github.com/Farama-Foundation/PettingZoo	NIPS	2021
Melting Pot	Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot	https://github.com/deepmind/meltingpot	ICML	2021
MuJoCo	MuJoCo: A physics engine for model-based control	https://github.com/deepmind/mujoco	IROS	2012
MALib	MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning	https://github.com/sjtu-marl/malib		2021
MAgent	MAgent: A many-agent reinforcement learning platform for artificial collective intelligence	https://github.com/Farama-Foundation/MAgent	AAAI	2018
Neural MMO	Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents	https://github.com/openai/neural-mmo		2019
MPE	Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments	https://github.com/openai/multiagent-particle-envs	NIPS	2017
Pommerman	Pommerman: A multi-agent playground	https://github.com/MultiAgentLearning/playground		2018

Dealing With Credit Assignment Issue

Value Decomposition

Paper	Code	Accepted at	Year
VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning	https://github.com/oxwhirl/pymarl	AAMAS	2017
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning	https://github.com/oxwhirl/pymarl	ICML	2018
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning	https://github.com/oxwhirl/pymarl	ICML	2019
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization	https://github.com/TonghanWang/NDQ	ICLR	2020
CollaQ: Multi-Agent Collaboration via Reward Attribution Decomposition	https://github.com/facebookresearch/CollaQ		2020
SQDDPG: Shapley Q-Value: A Local Reward Approach to Solve Global Reward Games	https://github.com/hsvgbkhgbv/SQDDPG	AAAI	2020
QPD: Q-value Path Decomposition for Deep Multiagent Reinforcement Learning		ICML	2020
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning	https://github.com/oxwhirl/wqmix	NIPS	2020
QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning			2020
QPLEX: Duplex Dueling Multi-Agent Q-Learning	https://github.com/wjh720/QPLEX	ICLR	2021

Other Methods

Paper	Code	Accepted at	Year
COMA: Counterfactual Multi-Agent Policy Gradients	https://github.com/oxwhirl/pymarl	AAAI	2018
LiCA: Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning	https://github.com/mzho7212/LICA	NIPS	2020

Policy Gradient

Paper	Code	Accepted at	Year
MADDPG: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments	https://github.com/openai/maddpg	NIPS	2017
COMA: Counterfactual Multi-Agent Policy Gradients	https://github.com/oxwhirl/pymarl	AAAI	2018
IPPO: Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?			2020
MAPPO: The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games	https://github.com/marlbenchmark/on-policy		2021
MAAC: Actor-Attention-Critic for Multi-Agent Reinforcement Learning	https://github.com/shariqiqbal2810/MAAC	ICML	2019
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients	https://github.com/TonghanWang/DOP	ICLR	2021
M3DDPG: Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient		AAAI	2019

Communication

Communication Without Bandwidth Constraint

Paper	Code	Accepted at	Year
CommNet: Learning Multiagent Communication with Backpropagation	https://github.com/facebookarchive/CommNet	NIPS	2016
BiCNet: Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games	https://github.com/Coac/CommNet-BiCnet		2017
VAIN: Attentional Multi-agent Predictive Modeling		NIPS	2017
IC3Net: Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks	https://github.com/IC3Net/IC3Net		2018
VBC: Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control		NIPS	2019
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation			2018
NDQ: Learning Nearly Decomposable Value Functions Via Communication Minimization	https://github.com/TonghanWang/NDQ	ICLR	2020
RIAL/DIAL: Learning to Communicate with Deep Multi-Agent Reinforcement Learning	https://github.com/iassael/learning-to-communicate	NIPS	2016
ATOC: Learning Attentional Communication for Multi-Agent Cooperation		NIPS	2018
Fully decentralized multi-agent reinforcement learning with networked agents	https://github.com/cts198859/deeprl_network	ICML	2018
TarMAC: Targeted Multi-Agent Communication		ICML	2019

Communication Under Limited Bandwidth

Paper	Accepted at	Year
SchedNet: Learning to Schedule Communication in Multi-Agent Reinforcement Learning		2019
Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing		2019
Gated-ACML: Learning Agent Communication under Limited Bandwidth by Message Pruning	AAAI	2020
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach	ICML	2020
Coordinating Multi-Agent Reinforcement Learning with Limited Communication	AAMAS	2013

Emergent

Paper	Code	Accepted at	Year
Multiagent Cooperation and Competition with Deep Reinforcement Learning		PLoS ONE	2017
Multi-agent Reinforcement Learning in Sequential Social Dilemmas			2017
Emergent preeminence of selfishness: an anomalous Parrondo perspective		Nonlinear Dynamics	2019
Emergent Coordination Through Competition			2019
Biases for Emergent Communication in Multi-agent Reinforcement Learning		NIPS	2019
Towards Graph Representation Learning in Emergent Communication			2020
Emergent Tool Use From Multi-Agent Autocurricula	https://github.com/openai/multi-agent-emergence-environments	ICLR	2020
On Emergent Communication in Competitive Multi-Agent Teams		AAMAS	2020
QED: Quasi-Equivalence Discovery for Zero-Shot Emergent Communication			2021
Incorporating Pragmatic Reasoning Communication into Emergent Language		NIPS	2020

Opponent Modeling

Paper	Code	Accepted at	Year
Bayesian Opponent Exploitation in Imperfect-Information Games
LOLA: Learning with Opponent-Learning Awareness
Variational Autoencoders for Opponent Modeling in Multi-Agent Systems
Stable Opponent Shaping in Differentiable Games
Opponent Modeling and Strategic Reasoning in the Real-time Strategy Game Starcraft
Opponent Modeling in Deep Reinforcement Learning	https://github.com/hhexiy/opponent	ICML	2016
Game Theory-Based Opponent Modeling in Large Imperfect-Information Games

Game Theoretic

Paper	Code	Accepted at	Year
α-Rank: Multi-Agent Evaluation by Evolution
α^α-Rank: Practically Scaling α-Rank through Stochastic Optimisation
A Game Theoretic Framework for Model Based Reinforcement Learning
Fictitious Self-Play in Extensive-Form Games
An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
Real World Games Look Like Spinning Tops
PSRO: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems
Neural Replicator Dynamics: Multiagent Learning via Hedging Policy Gradients

Hierarchical

Paper	Code	Accepted at	Year
Hierarchical multi-agent reinforcement learning
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery
Hierarchical Critics Assignment for Multi-agent Reinforcement Learning
Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction
HAMA: Multi-Agent Actor-Critic with Hierarchical Graph Attention Network

Ad Hoc Teamwork

Paper	Code	Accepted at	Year
CollaQ: Multi-Agent Collaboration via Reward Attribution Decomposition	https://github.com/facebookresearch/CollaQ		2020
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems
Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork
Open Ad Hoc Teamwork using Graph-based Policy Learning	https://github.com/uoe-agents/GPL	ICML	2021

League Training

Paper	Code	Accepted at	Year
AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning

Curriculum Learning

Paper	Code	Accepted at	Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning	https://github.com/starry-sky6688/MARL-Algorithms	AAAI	2020
EPC: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning	https://github.com/qian18long/epciclr2020	ICLR	2020
Emergent Tool Use From Multi-Agent Autocurricula	https://github.com/openai/multi-agent-emergence-environments	ICLR	2020
Learning to Teach in Cooperative Multiagent Reinforcement Learning
StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning
Cooperative Multi-agent Control using deep reinforcement learning	https://github.com/sisl/MADRL	AAMAS	2017
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

Mean Field

Paper	Code	Accepted at	Year
Mean Field Multi-Agent Reinforcement Learning
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning
Bayesian Multi-type Mean Field Multi-agent Imitation Learning

Transfer Learning

Paper	Code	Accepted at	Year
A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems
Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning

Meta Learning

Paper	Code	Accepted at	Year
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

Fairness

Paper	Code	Accepted at	Year
FEN: Learning Fairness in Multi-Agent Systems
Fairness in Multiagent Resource Allocation with Dynamic and Partial Observations
Fairness in Multi-agent Reinforcement Learning for Stock Trading

Exploration

Paper	Code	Accepted at	Year
EITI/EDTI: Influence-Based Multi-Agent Exploration	https://github.com/TonghanWang/EITI-EDTI	ICLR	2020
MAVEN: Multi-Agent Variational Exploration	https://github.com/starry-sky6688/MARL-Algorithms	NIPS	2019
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework
Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory
LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning	https://github.com/yalidu/liir	NIPS	2019
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

Graph Neural Network

Paper	Code	Accepted at	Year
Multi-Agent Game Abstraction via Graph Attention Neural Network	https://github.com/starry-sky6688/MARL-Algorithms	AAAI	2020
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation
Multi-Agent Reinforcement Learning with Graph Clustering
Learning to Coordinate with Coordination Graphs in Repeated Single-Stage Multi-Agent Decision Problems

Model-based

Paper	Code	Accepted at	Year
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping			2020

NAS

Paper	Code	Accepted at	Year
MANAS: Multi-Agent Neural Architecture Search			2019

Safe Multi-Agent Reinforcement Learning

Paper	Code	Accepted at	Year
MAMPS: Safe Multi-Agent Reinforcement Learning via Model Predictive Shielding			2019
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman			2019

From Single-Agent to Multi-Agent

Paper	Code	Accepted at	Year
IQL: Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents	https://github.com/oxwhirl/pymarl	ICML	1993
IPPO: Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?			2020
MAPPO: The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games	https://github.com/marlbenchmark/on-policy		2021
MADDPG: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments	https://github.com/openai/maddpg	NIPS	2017

Discrete-Continuous Hybrid Action Space / Parameterized Action Space

Paper	Accepted at	Year
Deep Reinforcement Learning in Parameterized Action Space		2015
DMAPQN: Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces	IJCAI	2019
H-PPO: Hybrid actor-critic reinforcement learning in parameterized action space	IJCAI	2019
P-DQN: Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space		2018

Role

Paper	Code	Accepted at	Year
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles	https://github.com/TonghanWang/ROMA	ICML	2020
RODE: Learning Roles to Decompose Multi-Agent Tasks	https://github.com/TonghanWang/RODE	ICLR	2021

Diversity

Paper	Code	Accepted at	Year
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems		AAMAS	2021
Q-DPP: Multi-Agent Determinantal Q-Learning	https://github.com/QDPP-GitHub/QDPP	ICML	2020
Diversity is All You Need: Learning Skills without a Reward Function			2018
Modelling Behavioural Diversity for Learning in Open-Ended Games		ICML	2021
Diverse Agents for Ad-Hoc Cooperation in Hanabi		CoG	2019
Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning		IJCAI	2020
Quantifying environment and population diversity in multi-agent reinforcement learning			2021

Sparse Reward

Paper	Code	Accepted at	Year
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems		NIPS	2021
Individual Reward Assisted Multi-Agent Reinforcement Learning	https://github.com/MDrW/ICML2022-IRAT	ICML	2022

Large Scale

Paper	Code	Accepted at	Year
From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning	https://github.com/starry-sky6688/MARL-Algorithms	AAAI	2020

DTDE

Paper	Code	Accepted at	Year

Decision Transformer

Paper	Code	Accepted at	Year
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

Offline MARL

Paper	Code	Accepted at	Year
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks

Generalization

Paper	Code	Accepted at	Year
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios	https://github.com/James0618/unmas	TNNLS	2021

Adversarial

Paper	Accepted at	Year
Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems		2022
Distributed Multi-Agent Deep Reinforcement Learning for Robust Coordination against Noise		2022
On the Robustness of Cooperative Multi-Agent Reinforcement Learning	IEEE Security and Privacy Workshops	2020
Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning	CVPR workshop	2022
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient	AAAI	2019
Multi-agent Deep Reinforcement Learning with Extremely Noisy Observations	NIPS Deep Reinforcement Learning Workshop	2018
Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods		2021

Multi-Agent Path Finding

TODO

To be Categorized

Paper	Code	Accepted at	Year
Mind-aware Multi-agent Management Reinforcement Learning	https://github.com/facebookresearch/M3RL	ICLR	2019
Emergence of grounded compositional language in multi-agent populations	https://github.com/bkgoksel/emergent-language	AAAI	2018
Emergent Complexity via Multi-Agent Competition	https://github.com/openai/multiagent-competition	ICLR	2018
TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning	https://github.com/tencent-ailab/TLeague		2020
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers	https://github.com/hhhusiyi-monash/UPDeT	ICLR	2021

TODO

Multi-Agent Path Finding

Citation

If you find this repository useful, please cite our repo:

@misc{chen2021multi,
  author={Chen, Hao},
  title={Multi-Agent Reinforcement Learning Papers},
  year={2021},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/TimeBreaker/Multi-Agent-Reinforcement-Learning-papers}}
}
