
#+TITLE: An implementation of MADDPG
#+AUTHOR: xuehy
#+EMAIL: [email protected]
#+STARTUP: content

This is a PyTorch implementation of the [[https://arxiv.org/abs/1706.02275][multi-agent deep deterministic policy gradient (MADDPG) algorithm]].

The experimental environment is a modified version of Waterworld based on [[https://github.com/sisl/MADRL][MADRL]].

The main features (different from MADRL) of the modified Waterworld environment are:

- Evaders and poisons now bounce off the walls according to physical rules.
- Evaders, pursuers and poisons are now the same size, so that random actions lead to average rewards of around 0.
- Exactly =n_coop= agents are needed to catch food.
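
The wall-bounce behaviour in the first item can be pictured with a small, dependency-free sketch. It is only an illustration, not the code used in this repository: the function name =bounce=, the unit-square bounds and the time step =dt= are all assumptions, and the reflection is assumed to be perfectly elastic.

```python
def bounce(position, velocity, dt=1.0, low=0.0, high=1.0):
    """Advance a point by one step and reflect it off the walls of a box.

    Hypothetical helper: the box [low, high] on every axis and the
    perfectly elastic reflection are assumptions made for illustration.
    """
    new_position, new_velocity = [], []
    for p, v in zip(position, velocity):
        p = p + v * dt
        # Mirror the coordinate back inside the box; the loop also
        # covers steps large enough to cross the box more than once.
        while p < low or p > high:
            p = 2 * low - p if p < low else 2 * high - p
            v = -v  # the velocity component normal to the wall flips
        new_position.append(p)
        new_velocity.append(v)
    return new_position, new_velocity


# An evader moving right from x = 0.75 at speed 0.5 hits the wall at x = 1
# and ends the step moving left.
pos, vel = bounce([0.75, 0.5], [0.5, 0.0])
```

Only the velocity component perpendicular to the wall changes sign, so motion along the wall is unaffected.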

If scene rendering is enabled, installing =opencv= through [[https://github.com/conda-forge/opencv-feedstock][conda-forge]] is recommended.

** two agents, cooperation = 2

The two agents need to cooperate to catch the food for a reward of 10.
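
A minimal sketch of how such a cooperation rule could be checked is given below. It is not taken from this repository: the function name =catch_reward=, the contact distance and the reading of "exactly" as an equality test on the number of touching agents are assumptions; only the reward of 10 comes from the description above.

```python
import math


def catch_reward(food_pos, pursuer_positions, n_coop,
                 contact_dist=0.1, food_reward=10.0):
    """Return the food reward if exactly n_coop pursuers touch the food.

    Hypothetical helper: contact_dist and the equality test are
    assumptions made for illustration.
    """
    fx, fy = food_pos
    # Count the pursuers that are within contact distance of the food.
    touching = sum(
        1 for px, py in pursuer_positions
        if math.hypot(px - fx, py - fy) <= contact_dist
    )
    return food_reward if touching == n_coop else 0.0


# Both agents touch the food, so the catch succeeds with cooperation = 2.
both = catch_reward((0.0, 0.0), [(0.05, 0.0), (0.0, 0.05)], n_coop=2)
# Only one agent is close enough, so nothing is caught.
alone = catch_reward((0.0, 0.0), [(0.05, 0.0), (0.5, 0.5)], n_coop=2)
```

With =n_coop=1= the same check reduces to the single-agent setting.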

[[PNG/demo.gif]]

[[PNG/3.png]]

the average

[[PNG/4.png]]

** one agent, cooperation = 1

[[PNG/newplot.png]]
