distributed_reinforcement_learning
Implementation of Distributed Reinforcement Learning with TensorFlow
Information
- 20 actors with 1 learner.
- TensorFlow implementation of a server-client architecture using Distributed TensorFlow.
- Recurrent Experience Replay in Distributed Reinforcement Learning is implemented on Breakout-Deterministic-v4 as a POMDP (the observation is not provided with 20% probability).
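The POMDP setup above (observation withheld 20% of the time) can be sketched as a masking step applied to each frame before it reaches the agent. This is a minimal sketch; the function name and the blank-frame convention are assumptions, not the repo's actual code.

```python
import numpy as np

def mask_observation(obs, drop_prob=0.2, rng=None):
    """With probability drop_prob, return an all-zero frame in place of
    the real observation -- a simple way to turn Breakout into a POMDP."""
    rng = rng or np.random.default_rng()
    if rng.random() < drop_prob:
        return np.zeros_like(obs)
    return obs

# Example: an 84x84 grayscale frame is dropped roughly 1 time in 5.
frame = np.ones((84, 84), dtype=np.float32)
masked = mask_observation(frame, drop_prob=0.2, rng=np.random.default_rng(0))
```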
Dependencies
opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0
Implementation
- [x] Asynchronous Methods for Deep Reinforcement Learning
- [x] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- [x] Distributed Prioritized Experience Replay
- [x] Recurrent Experience Replay in Distributed Reinforcement Learning
How to Run
- A3C: Asynchronous Methods for Deep Reinforcement Learning
python train_a3c.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
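All four scripts follow the same launch pattern: each process joins a single Distributed TensorFlow cluster identified by its `--job_name`/`--task` pair. A sketch of the kind of cluster spec this implies (the host and port numbers are illustrative assumptions, not the repo's actual values):

```python
def make_cluster_spec(n_actors=20, host="localhost", base_port=8000):
    """One learner task plus n_actors actor tasks, each on its own port.
    The returned dict is the format tf.train.ClusterSpec accepts."""
    return {
        "learner": [f"{host}:{base_port}"],
        "actor": [f"{host}:{base_port + 1 + i}" for i in range(n_actors)],
    }

cluster = make_cluster_spec()
# Each launched process would then start its own server, e.g.
#   server = tf.train.Server(cluster, job_name="actor", task_index=0)
# with shared variables pinned to /job:learner/task:0.
```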
- Ape-X: Distributed Prioritized Experience Replay
python train_apex.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19
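Ape-X's central piece is a shared replay buffer that the learner samples from proportionally to TD-error priority. A minimal sketch using a flat array rather than the sum-tree used in practice (the class and parameter names are assumptions):

```python
import numpy as np

class PrioritizedReplay:
    """Proportional prioritized replay: sample i with probability
    p_i^alpha / sum_j p_j^alpha. O(n) sampling; fine for a sketch."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, priority=1.0):
        if len(self.data) >= self.capacity:       # drop the oldest entry
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority ** self.alpha)

    def sample(self, batch_size, rng=None):
        rng = rng or np.random.default_rng()
        p = np.asarray(self.priorities)
        idx = rng.choice(len(self.data), size=batch_size, p=p / p.sum())
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, new_priorities):
        # The learner calls this after computing fresh TD errors.
        for i, pr in zip(idx, new_priorities):
            self.priorities[i] = pr ** self.alpha
```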
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
python train_impala.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19
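IMPALA corrects for the policy lag between actors and the learner with V-trace (Espeholt et al., 2018). A NumPy sketch of the target computation via its backward recursion (the function signature is an assumption):

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """V-trace value targets via the recursion
       v_s = V(x_s) + delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})),
    where delta_t = rho_t * (r_t + gamma * V(x_{t+1}) - V(x_t)) and
    `ratios` are per-step importance weights pi(a|x) / mu(a|x)."""
    rho = np.minimum(rho_bar, ratios)          # clipped rho_t
    c = np.minimum(c_bar, ratios)              # clipped c_t
    values_next = np.append(values[1:], bootstrap_value)
    deltas = rho * (rewards + gamma * values_next - values)
    vs = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(rewards))):    # backward pass
        acc = deltas[t] + gamma * c[t] * acc
        vs[t] = values[t] + acc
    return vs
```

When the behavior and target policies coincide (all ratios equal 1), the targets telescope into ordinary n-step bootstrapped returns.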
- R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning
python train_r2d2.py --job_name learner --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
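R2D2 stores fixed-length, overlapping sequences (along with the recurrent state at the start of each; the first portion is replayed only to "burn in" the LSTM before loss is computed). A sketch of the slicing, using the paper's default length 80 and overlap 40 (the helper name is an assumption):

```python
def make_sequences(episode, seq_len=80, overlap=40):
    """Split an episode (a list of transitions) into overlapping
    fixed-length sequences; adjacent sequences share `overlap` steps,
    which serve as the burn-in region for the next sequence."""
    step = seq_len - overlap
    seqs = []
    for start in range(0, max(1, len(episode) - seq_len + 1), step):
        seqs.append(episode[start:start + seq_len])
    return seqs
```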
Reference
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Distributed Prioritized Experience Replay
- Recurrent Experience Replay in Distributed Reinforcement Learning
- deepmind/scalable_agent
- google-research/seed-rl
- Asynchronous_Advantage_Actor_Critic
- Relational_Deep_Reinforcement_Learning
- Deep Recurrent Q-Learning for Partially Observable MDPs