DDQN.pytorch
Double Deep Q Learning (DDQN) In PyTorch
DDQN implementation on the PLE FlappyBird environment in PyTorch.

DDQN was proposed to address the overestimation issue of Deep Q-Learning (DQN). The online network selects the greedy action for the next state, while a separate target network evaluates that action's value, decoupling action selection from value evaluation.
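The decoupling described above can be sketched as follows in PyTorch. This is a minimal illustration of the DDQN target, not this repo's exact code; the function and argument names are illustrative.

```python
import torch

def ddqn_target(reward, next_state, done, policy_net, target_net, gamma=0.99):
    """Compute y = r + gamma * Q_target(s', argmax_a Q_policy(s', a))."""
    with torch.no_grad():
        # The online (policy) network selects the greedy next action...
        next_actions = policy_net(next_state).argmax(dim=1, keepdim=True)
        # ...while the target network evaluates that action's value.
        next_q = target_net(next_state).gather(1, next_actions).squeeze(1)
        # Terminal states (done=1) contribute no bootstrapped value.
        return reward + gamma * next_q * (1.0 - done)
```

In plain DQN, the target network both selects and evaluates the next action (`target_net(next_state).max(1)`), which biases values upward; splitting the two roles is the whole change DDQN makes.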
Requirement
- Python 3.6
- Pytorch
- Visdom
- PLE (PyGame-Learning-Environment)
- Moviepy
Algorithm

- In this implementation, the policy network is updated once per episode e, not at every step t.
- Input images are simplified for faster convergence.
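The repo's exact preprocessing is defined in its code, but "simplify input images" typically means something like the sketch below: convert the RGB frame to grayscale, downsample it, and binarize it so the network sees a small, nearly binary image. The function name and sizes here are assumptions for illustration.

```python
import numpy as np

def preprocess(frame, size=80):
    """frame: HxWx3 uint8 RGB -> size x size float32 image in {0, 1}."""
    gray = frame.astype(np.float32).mean(axis=2)   # RGB -> grayscale
    h, w = gray.shape
    # Nearest-neighbour downsample to size x size.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = gray[np.ix_(rows, cols)]
    # Threshold to a near-binary image; fine texture is irrelevant
    # for FlappyBird, only pipe/bird silhouettes matter.
    return (small > 127).astype(np.float32)
```

Feeding 80x80 binary frames instead of raw 288x512 RGB frames shrinks the input by two orders of magnitude, which is what speeds up convergence.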
Usage
- Hyperparameters are defined in config.py
- Train
python main.py --train=True --video_path=./video --logs_path=./logs
- Restore Pretrained Model
python main.py --restore=./pretrain/model-98500.pth
- Visualize loss and reward curve
python -m visdom.server
python visualize.py --logs_path=./logs
Result
- Full Video (at 60 FPS)
- Reward