distributed_reinforcement_learning
Implementation of Distributed Reinforcement Learning with TensorFlow
Information
- 20 actors with 1 learner.
- TensorFlow implementation of a server-client architecture using Distributed TensorFlow.
- Recurrent Experience Replay in Distributed Reinforcement Learning is implemented on Breakout-Deterministic-v4 as a POMDP (the observation is not provided with 20% probability).
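The POMDP setup above (observation withheld 20% of the time) can be sketched as a masking step applied to each frame before it reaches the agent. This is a minimal sketch; the function name and the blank-frame convention are assumptions, not the repo's actual code.

```python
import numpy as np

def mask_observation(obs, drop_prob=0.2, rng=None):
    """With probability drop_prob, return an all-zero frame in place of
    the real observation -- a simple way to turn Breakout into a POMDP."""
    rng = rng or np.random.default_rng()
    if rng.random() < drop_prob:
        return np.zeros_like(obs)
    return obs

# Example: an 84x84 grayscale frame is dropped roughly 1 time in 5.
frame = np.ones((84, 84), dtype=np.float32)
masked = mask_observation(frame, drop_prob=0.2, rng=np.random.default_rng(0))
```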
Dependencies
opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0
Implementation
- [x] Asynchronous Methods for Deep Reinforcement Learning
- [x] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- [x] Distributed Prioritized Experience Replay
- [x] Recurrent Experience Replay in Distributed Reinforcement Learning
How to Run
- A3C: Asynchronous Methods for Deep Reinforcement Learning
python train_a3c.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
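All four scripts follow the same launch pattern: each process joins a single Distributed TensorFlow cluster identified by its `--job_name`/`--task` pair. A sketch of the kind of cluster spec this implies (the host and port numbers are illustrative assumptions, not the repo's actual values):

```python
def make_cluster_spec(n_actors=20, host="localhost", base_port=8000):
    """One learner task plus n_actors actor tasks, each on its own port.
    The returned dict is the format tf.train.ClusterSpec accepts."""
    return {
        "learner": [f"{host}:{base_port}"],
        "actor": [f"{host}:{base_port + 1 + i}" for i in range(n_actors)],
    }

cluster = make_cluster_spec()
# Each launched process would then start its own server, e.g.
#   server = tf.train.Server(cluster, job_name="actor", task_index=0)
# with shared variables pinned to /job:learner/task:0.
```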
- Ape-X: Distributed Prioritized Experience Replay
python train_apex.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19
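Ape-X's central piece is a shared replay buffer that the learner samples from proportionally to TD-error priority. A minimal sketch using a flat array rather than the sum-tree used in practice (the class and parameter names are assumptions):

```python
import numpy as np

class PrioritizedReplay:
    """Proportional prioritized replay: sample i with probability
    p_i^alpha / sum_j p_j^alpha. O(n) sampling; fine for a sketch."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, priority=1.0):
        if len(self.data) >= self.capacity:       # drop the oldest entry
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority ** self.alpha)

    def sample(self, batch_size, rng=None):
        rng = rng or np.random.default_rng()
        p = np.asarray(self.priorities)
        idx = rng.choice(len(self.data), size=batch_size, p=p / p.sum())
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, new_priorities):
        # The learner calls this after computing fresh TD errors.
        for i, pr in zip(idx, new_priorities):
            self.priorities[i] = pr ** self.alpha
```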
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
python train_impala.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19
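IMPALA corrects for the policy lag between actors and the learner with V-trace (Espeholt et al., 2018). A NumPy sketch of the target computation via its backward recursion (the function signature is an assumption):

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """V-trace value targets via the recursion
       v_s = V(x_s) + delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})),
    where delta_t = rho_t * (r_t + gamma * V(x_{t+1}) - V(x_t)) and
    `ratios` are per-step importance weights pi(a|x) / mu(a|x)."""
    rho = np.minimum(rho_bar, ratios)          # clipped rho_t
    c = np.minimum(c_bar, ratios)              # clipped c_t
    values_next = np.append(values[1:], bootstrap_value)
    deltas = rho * (rewards + gamma * values_next - values)
    vs = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(rewards))):    # backward pass
        acc = deltas[t] + gamma * c[t] * acc
        vs[t] = values[t] + acc
    return vs
```

When the behavior and target policies coincide (all ratios equal 1), the targets telescope into ordinary n-step bootstrapped returns.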
- R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
python train_r2d2.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
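R2D2 stores fixed-length, overlapping sequences (along with the recurrent state at the start of each; the first portion is replayed only to "burn in" the LSTM before loss is computed). A sketch of the slicing, using the paper's default length 80 and overlap 40 (the helper name is an assumption):

```python
def make_sequences(episode, seq_len=80, overlap=40):
    """Split an episode (a list of transitions) into overlapping
    fixed-length sequences; adjacent sequences share `overlap` steps,
    which serve as the burn-in region for the next sequence."""
    step = seq_len - overlap
    seqs = []
    for start in range(0, max(1, len(episode) - seq_len + 1), step):
        seqs.append(episode[start:start + seq_len])
    return seqs
```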
Reference
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- deepmind/scalable_agent
- google-research/seed-rl
- Asynchronous_Advantage_Actor_Critic
- Relational_Deep_Reinforcement_Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs