IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures
Implementation of IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- These results were obtained with only 4 threads, so training is unstable.
- TensorFlow implementation
- A3C-style multi-threaded environment training
- PongDeterministic-v4 environment
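
The core of IMPALA is the V-trace off-policy correction, which lets the learner use trajectories generated by slightly stale actor policies. The following is a minimal pure-Python sketch of the V-trace targets as defined in the IMPALA paper, not the code used in this repository. The function name `vtrace_targets`, its argument names, and the default clipping thresholds `rho_bar=1.0` and `c_bar=1.0` are assumptions made for illustration.

```python
import math

def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, discounts, rho_bar=1.0, c_bar=1.0):
    """Illustrative V-trace sketch (hypothetical helper, not this repo's API).

    All sequence arguments have length T (one trajectory).
    behaviour_logp / target_logp: log-probabilities of the taken actions
    under the actor (behaviour) policy and the learner (target) policy.
    discounts: per-step discount, e.g. gamma, or 0.0 at episode end.
    Returns (vs, pg_advantages), both lists of length T.
    """
    T = len(rewards)
    # Importance ratios pi(a|x) / mu(a|x), then truncated at rho_bar and c_bar.
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    rhos = [min(rho_bar, r) for r in ratios]
    cs = [min(c_bar, r) for r in ratios]

    # V(x_{t+1}) for each step; the last step bootstraps from bootstrap_value.
    next_values = list(values[1:]) + [bootstrap_value]
    # Truncated importance-weighted TD errors.
    deltas = [rhos[t] * (rewards[t] + discounts[t] * next_values[t] - values[t])
              for t in range(T)]

    # Backward recursion: v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})).
    vs = [0.0] * T
    acc = 0.0
    for t in reversed(range(T)):
        acc = deltas[t] + discounts[t] * cs[t] * acc
        vs[t] = values[t] + acc

    # Advantages for the policy gradient: rho_s * (r_s + gamma * v_{s+1} - V(x_s)).
    next_vs = vs[1:] + [bootstrap_value]
    pg_advantages = [rhos[t] * (rewards[t] + discounts[t] * next_vs[t] - values[t])
                     for t in range(T)]
    return vs, pg_advantages
```

When the behaviour and target policies are identical and both thresholds are 1.0, every ratio is 1 and the targets reduce to ordinary n-step bootstrapped returns, which is a convenient sanity check for any implementation.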
![Gameplay video](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/video.gif)
![Entropy](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/entropy.png)
![Episode steps](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/episode_step.png)
![Max probability](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/max_prob.png)
![Policy loss](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/pi_loss.png)
![Score](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/score.png)
![Value loss](https://github.com/RLOpensource/IMPALA-Scalable-Distributed-Deep-RL-with-Importance-Weighted-Actor-Learner-Architectures/raw/master/source/value_loss.png)
Todo
- [x] CPU-only training method
- [ ] Use a network protocol method
- [ ] Training on GPU, inference on CPU