self-imitation-learning-pytorch
This is a PyTorch implementation of the ICML 2018 paper Self-Imitation Learning.
Self-Imitation-Learning with A2C
This is a PyTorch version of A2C + SIL, which is basically the same as the OpenAI Baselines implementation. The paper can be found here.
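As a rough illustration of what SIL adds on top of A2C, the sketch below computes the two SIL loss terms from the paper on a small batch: a policy term weighted by the clipped advantage max(R - V, 0) and a value term of half its square. It is written in plain Python so it runs without any dependencies; it is not the code in this repository, and the function names `discounted_returns` and `sil_losses`, as well as the simple batch-mean normalization, are assumptions made for the example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Discounted return R_t = r_t + gamma * R_{t+1} for one finished episode."""
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return out[::-1]


def sil_losses(log_probs, values, returns):
    """Illustrative SIL losses, averaged over the batch.

    log_probs: log pi(a|s) of the stored actions under the current policy
    values:    current value estimates V(s)
    returns:   discounted returns R stored in the replay buffer
    """
    n = len(returns)
    # Clipped advantage (R - V)+: only transitions whose past return
    # exceeded the current value estimate contribute to the update.
    clipped = [max(r - v, 0.0) for r, v in zip(returns, values)]
    # Policy term: -log pi(a|s) * (R - V)+, with the advantage treated as a constant.
    policy_loss = sum(-lp * adv for lp, adv in zip(log_probs, clipped)) / n
    # Value term: 0.5 * ((R - V)+)^2.
    value_loss = sum(0.5 * adv ** 2 for adv in clipped) / n
    return policy_loss, value_loss


returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
policy_loss, value_loss = sil_losses(
    log_probs=[-1.0, -2.0], values=[0.5, 3.0], returns=returns[:2]
)
```

In a PyTorch version the same arithmetic would operate on tensors, with the clipped advantage detached in the policy term so that gradients only flow through the log-probabilities.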
TODO List
- [ ] Add PPO with SIL
- [ ] Add more results
Requirements
- python-3.5.2
- openai-baselines
- pytorch-0.4.0
Installation
Install OpenAI Baselines. An older version of openai-baselines is currently required; this will be fixed in the future.
# clone the openai baselines
git clone https://github.com/openai/baselines.git
cd baselines
git checkout 366f486
pip install -e .
How to use the code
Train the network (add --cuda only if you have a GPU):
python train.py --env-name 'PongNoFrameskip-v4' --cuda
Test the network:
python demo.py --env-name 'PongNoFrameskip-v4'
You can also run the A2C algorithm without SIL by adding the --no-sil flag:
python train.py --env-name 'PongNoFrameskip-v4' --cuda --no-sil
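For readers curious how flags like these are typically wired up, here is a minimal sketch using `argparse` from the standard library. Only the flag names `--env-name`, `--cuda`, and `--no-sil` come from the commands above; the `build_parser` helper, the default value, and the help strings are assumptions for illustration, and the actual `train.py` may declare its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="A2C + SIL training (illustrative)")
    # Default environment is an assumption, not taken from the repository.
    parser.add_argument("--env-name", type=str, default="PongNoFrameskip-v4",
                        help="Gym environment id")
    parser.add_argument("--cuda", action="store_true",
                        help="run on the GPU if one is available")
    parser.add_argument("--no-sil", action="store_true",
                        help="disable SIL and train with plain A2C")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--env-name", "PongNoFrameskip-v4", "--no-sil"])
use_sil = not args.no_sil
```

Using `store_true` for `--no-sil` keeps SIL enabled by default, which matches the commands above where the flag is only passed to switch SIL off.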
Training Performance
Due to time constraints, I only ran Pong for 2 million steps. The results for MontezumaRevenge will be uploaded later!
Additional results for Freeway, which correspond to those in the original paper.